(57) Abstract

The invention relates to an improved expression system for newly introduced genes in yeast and comprises a yeast
regulon, and preferably a transcription terminator, derived from one of the GAPDH genes of S. cerevisiae. The GAPDH
regulon described, about 850 nucleotides long, proved to be almost ten times as effective as smaller regulons described
previously. Said regulon and/or terminator can be introduced into yeasts either as part of plasmid molecules or by
incorporation into the yeast genome. Vectors containing the expression system preferably comprise an autonomously
replicating sequence derived from K. lactis (KARS) or an origin of replication originating from the S. cerevisiae
2 micron yeast plasmid. After transformation of yeasts, in particular of the genera Kluyveromyces and Saccharomyces,
with said vectors, the transformed yeasts can produce foreign proteins more effectively. In the case of thaumatin the
presence of a signal polypeptide (the pre-part) appears to be essential for expression in yeast. DNA sequences encoding
other signal polypeptides are also described. The use of codons preferred by yeast, both in structural genes (preferably
those encoding thaumatin-like proteins and chymosin-like proteins) and in DNA sequences encoding signal polypeptides,
is also described.
IMPROVEMENTS IN THE EXPRESSION OF NEWLY INTRODUCED
GENES IN YEAST CELLS

The present invention relates to improvements in the
expression of newly introduced genes in yeast cells.

In particular the invention relates to a DNA sequence
capable of initiating transcription by yeast RNA
polymerase II which includes at least part of the regulon
region of one of the GAPDH genes of S. cerevisiae
(DNA = DeoxyriboNucleic Acid; RNA = RiboNucleic Acid;
GAPDH = GlycerAldehyde-3-Phosphate DeHydrogenase;
mRNA = messenger RNA).

The invention further relates to the use of a DNA
sequence capable of both terminating transcription
by yeast RNA polymerase II and effecting
polyadenylation of the mRNA, which includes at least
part of the termination/polyadenylation region
belonging to one of the GAPDH genes of S. cerevisiae.

The invention also relates to a larger rDNA sequence
which contains at least the above-indicated regulon
region of one of the GAPDH genes of S. cerevisiae and
one or more structural genes different from the GAPDH
genes of S. cerevisiae, which DNA sequence can be
inserted into a recombinant DNA plasmid or into a yeast
chromosome in order to transform yeasts so that they
become able to produce the desired proteins encoded by
the structural genes.

Finally, the invention relates to a process for
preparing a protein by cultivating a yeast containing the
above-mentioned larger rDNA sequences under conditions
whereby the protein is formed and isolating the protein
from that yeast culture, as well as to the proteins
produced by such a process.
BACKGROUND OF THE INVENTION

Developments in recombinant DNA (= rDNA) technology
have made it possible to isolate or synthesize specific
genes or portions thereof from higher organisms such as
man, animals and plants, and to transfer these genes or
gene fragments to microorganisms such as bacteria or
yeasts. The transferred gene is replicated and
propagated as the transformed microorganism replicates. As a
result, the transformed microorganism may become
endowed with the capacity to make whatever protein the
transferred gene or gene fragment encodes, whether it is
an enzyme, a hormone, an antigen, an antibody, or a
portion thereof. The microorganism passes on this
capability to its progeny, so that in effect the transfer
has resulted in a new microbial strain having the
described capability.

A basic fact underlying the application of this
technology for practical purposes is that the DNA of all
living organisms, from microbes to man, is chemically
similar, being composed of the same four nucleotides.
For example, the same nucleotide sequence which codes
for the amino acid sequence specifying preprochymosin
in stomach cells of newborn mammals will, when
transferred to a microorganism, be recognized as coding for
the same amino acid sequence.

The basic constituents of recombinant DNA
technology are:

i)   the gene encoding the protein of interest;
ii)  a vector (plasmid) into which the new gene has to
     be inserted to guarantee stable replication and a
     high level of expression of the gene;
iii) a suitable host microorganism into which the vector
     carrying the new gene can be introduced.
The plasmid vector and the host organism have to be
selected according to the nature of the protein to be
synthesized, the industrial application of this protein
and the technically possible fermentation and
purification procedures. In most cases it will be highly
important to select a host organism which is not
suspected of producing toxic substances, i.e. a
microorganism mentioned in the GRAS (Generally
Recognized As Safe) list.

However, only very few of these GRAS microorganisms
meet all requirements imposed by the application of
either recombinant DNA technology or fermentation
technology. A selection on the basis of these positive
criteria shows that, at this moment, certain yeast
species, notably of the genera Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis and Rhodotorula, can be regarded as very
promising host organisms for genetically engineered
DNA molecules.

In the present invention use is made of recombinant
DNA, molecular biological and chemical techniques to
construct plasmid vectors that can be stably maintained
within yeasts and, most importantly, contain the
appropriate regulons to bring about a high level of
expression of newly introduced genes.

Several plasmid vectors are now known which can be
used for the transformation of the yeast Saccharomyces
cerevisiae [C.P. Hollenberg, Current Topics in Microb.
and Immunol. 96, 119-144 (1982) and A. Hinnen and B.
Meyhack, Current Topics in Microb. and Immunol. 96,
101-117 (1982)]. To maintain the vector and the
inserted gene within the host cell, these vectors rely
either on autonomous replication sequences (ARS)
isolated from the chromosomal DNA of particular yeast
species or on the replicating ability of the 2 micron
DNA plasmid found in Saccharomyces cerevisiae.
Additionally, these yeast vectors contain a marker by
which transformants can be selected. Examples of such
markers are the leu 2 gene [A. Hinnen et al., Proc.
Natl. Acad. Sci. USA, 75, 1929-1933 (1978)], the trp 1
gene [K. Struhl et al., Proc. Natl. Acad. Sci. USA, 76,
1035-1039 (1979)], the lactase gene [R.C. Dickson,
Gene, 10, 347-356 (1980)], and genes which confer on
the host cell resistance to certain antibiotics.

The stability of ARS in S. cerevisiae, or of KARS
in Kluyveromyces lactis or K. fragilis, is not
always sufficient for the development of a reliable
fermentation process using these yeasts. Therefore,
integration of the foreign structural gene into the
chromosome(s) of the new host cell can be very
important for the industrial application of rDNA-containing
yeasts.

From an economic point of view, not only the
stability of the inserted gene within the yeast cell is
important but also the efficiency with which this gene
is expressed as protein. Based on today's knowledge, the
main routes to achieve high levels of expression of a
newly inserted gene include:

- integration of the structural gene downstream of a
  promoter (RNA initiation) site which can effect a
  high transcription frequency of the gene. Ideally
  the promoter activity is inducible, i.e. can be
  switched on or off by a temperature shift
  or by the presence of an inducer in the growth medium.
  Potent promoters operating in yeasts are those
  responsible for transcription of the genes encoding
  glycolytic enzymes. Experimental work done by Maitra
  and Lobo [J. Biol. Chem., 246, 489-499 (1971)] suggests
  furthermore that some of these promoters are highly
  inducible. However, a serious difficulty with regard
  to the isolation of such promoters is that up to now
  little is known about the nucleotide sequences which
  confer regulation and full promoter activity on the
  DNA fragment.

- integration of an RNA (RiboNucleic Acid) termination
  signal downstream of the structural gene. Owing to
  the presence of such a termination signal,
  transcription of the structural gene cannot interfere
  with the transcription of adjacent operons. Moreover,
  transcription seems to be more efficient [K.S. Zaret
  and F. Sherman, Cell, 28, 563-573 (1982)] and
  polyadenylation of the messenger is likely to occur
  correctly, resulting in a more stable mRNA (messenger
  RNA) population. However, up to now exact data on the
  nucleotide sequences required for termination of
  transcription in yeasts are not available.

- the presence of a nucleotide sequence flanking the
  AUG codon in the RNA molecule which is optimal for
  protein synthesis initiation. According to published
  data [M. Kozak, Nucleic Acids Res. 9, 5233-5252
  (1981)] the positions -3 and +4 in N X X A U G N are
  highly conserved (A or C at -3 and G at +4), which
  observation suggests a role for these nucleotides in
  the recognition of the AUG codon as a translation
  start point by the ribosome. Although one might
  expect an efficient yeast promoter to contain either
  an A or a C at the -3 position, the
  nucleotide at the +4 position forms part of the
  coding sequence and therefore depends on the
  nature of the gene to be inserted downstream of the
  promoter. This indicates that it will be difficult to
  fulfil this condition in all cases.

- the copy number of the vector within the host cell.
  In most cases high copy numbers will lead to higher
  mRNA levels and, therefore, to higher expression
  levels of a gene. In S. cerevisiae, vectors containing
  the 2 micron DNA replication origin can reach copy
  numbers as high as 50. This is considerably more than
  can be reached by, for instance, integrating vectors or
  vectors containing autonomously replicating sequences
  (ARS) [K. Struhl et al., Proc. Natl. Acad. Sci. USA, 76,
  1035-1039 (1979) and A.J. Kingsman et al., Gene, 7,
  141-152 (1979)]. However, in yeast species other than
  Saccharomyces, 2 micron DNA has not yet been found,
  suggesting that its replication origin will be
  functional in a very limited number of yeast species
  only. Experimental data obtained so far show that the 2
  micron replication origin is functional in
  Schizosaccharomyces pombe [D. Beach and P. Nurse,
  Nature, 290, 140-142 (1981)], but not in Kluyveromyces
  lactis [G. Das and C.P. Hollenberg, Curr. Genet. 5,
  123-128 (1982)].

  Therefore the transformation of yeasts belonging to
  genera other than Saccharomyces or
  Schizosaccharomyces will in most cases depend upon the
  availability of other DNA replication origins, such as ARS
  isolated from the organism to be transformed, or upon
  the integration of the foreign gene into the yeast
  genome.

- a codon use of the gene which is optimal for the host
  organism used. Results obtained with the yeast S.
  cerevisiae show a strong correlation between the
  abundance of certain tRNA (transfer RNA) species and
  the occurrence of the respective codons in its
  protein genes. Therefore, optimal expression of, for
  instance, the bovine preprochymosin gene or the plant
  preprothaumatin gene in S. cerevisiae would require
  a chemical synthesis of both genes with a codon
  population which correlates with these abundant yeast
  tRNA species.

- an additional factor which might influence the
  translation of a gene is the presence of a DNA sequence
  encoding a so-called signal sequence. These signal
  sequences are in most cases hydrophobic N-terminal
  protein extensions which are often involved in the
  process of cotranslational secretion of the protein
  through a membrane. In the present specification new
  data obtained with the expression of the
  preprothaumatin gene in yeast are shown, indicating that
  when the DNA sequence encoding the signal protein is
  removed from the gene, expression of the gene is
  reduced by a few orders of magnitude.
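The start-codon context mentioned in the list above (positions -3 and +4 around the AUG) can be inspected with a few lines of code. The following is a minimal sketch, not part of the original specification: the function name, the example sequences and the choice to pass the allowed nucleotides as parameters are all assumptions made for illustration, and the default sets simply repeat the nucleotides named in the text.

```python
def start_codon_context(mrna, aug_index, minus3="AC", plus4="G"):
    """Report the bases at -3 and +4 relative to an AUG start codon.

    Numbering follows the usual convention: the A of AUG is +1 and the
    base immediately upstream is -1 (there is no position 0).
    The allowed sets `minus3` and `plus4` default to the nucleotides
    named in the text; they are parameters, not established facts.
    """
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(mrna):
        raise ValueError("sequence too short to inspect -3 and +4")
    base_m3 = mrna[aug_index - 3]   # position -3
    base_p4 = mrna[aug_index + 3]   # position +4, first base after AUG
    return base_m3, base_p4, (base_m3 in minus3 and base_p4 in plus4)


# Hypothetical leader/coding junctions, invented for this example only.
good = "CCAAAAAUGGCC"
poor = "CCAUAAAUGUCC"
print(start_codon_context(good, good.find("AUG")))
print(start_codon_context(poor, poor.find("AUG")))
```

Because +4 lies inside the coding sequence, only the -3 position can be chosen freely when a promoter fragment is designed, which is the difficulty the text points out.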

At present the number of known yeast species which can
serve as a host for recombinant DNA molecules is
limited to Saccharomyces cerevisiae and
Schizosaccharomyces pombe.

SUMMARY OF THE INVENTION 

Therefore, a need exists for additional yeast species
with other metabolic properties and other nutritional
demands, so that the field to which recombinant DNA
technology can be applied on a commercial level will be
broadened. However, the use of new yeast species
requires suitable DNA replication origins to guarantee
stable replication of the vector molecule in the new
host, as well as an appropriate expression system for
newly inserted DNA. The present invention provides an
expression system, the use of which is not restricted
to S. cerevisiae but which can function in other yeast
species as well. To achieve this, the RNA
initiation/regulation and the RNA termination/
polyadenylation signals of a glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene of S. cerevisiae were
isolated.
The RNA initiation site of a GAPDH gene is known to be
a very efficient promoter in S. cerevisiae [M.J.
Holland et al., Biochemistry 17, 4900-4907 (1978)]. In
addition, the promoter according to the present
invention was isolated and applied after it was
realized that GAPDH is a metabolic key enzyme in many
different organisms, suggesting a certain conservatism
during evolution with regard to the nucleotide sequence
regulating its expression. On this basis further
experiments were carried out. It was observed (i) that
the isolated and radioactively labelled GAPDH regulon
fragment hybridized with colonies of various yeast
species, (ii) that the S. cerevisiae GAPDH regulon
expressed foreign genes in K. lactis efficiently and
(iii) that the larger regulon region of about 850
nucleotides was more effective than the smaller regulon
region of about 280 nucleotides published by Holland
J.P. and Holland M.J. in J. Biol. Chem. 255, 2596-2605
(1980). These new findings gave us confidence that the
regulon isolated will prove to be useful in the
expression of foreign DNA in a number of other yeast
species.

The present invention provides a DNA sequence capable
of initiating transcription by yeast RNA polymerase II
which includes at least part of the regulon region of
one of the GAPDH genes of S. cerevisiae, characterized
in that it comprises a DNA sequence essentially as
given in Fig. 2, and wherein the regulon region is
optionally modified to include at least one restriction
enzyme cleavage site, to facilitate manipulation of the
nucleotide sequence region for protein synthesis
initiation.

An example of a modification of the regulon is given in
item 4 and Figs. 7A and 8, where the introduction of a
Sac I site is described. Although in Fig. 2 only one
specific DNA sequence is described, it will be clear to
the expert that modifications of this DNA sequence,
either by replacement of one or more nucleotides or by
addition or deletion of one or more nucleotides, which
modifications do not impair the properties of the
regulon region given in Fig. 2, are within the scope
of the invention.

The invention further provides a DNA sequence capable
of both terminating transcription by yeast RNA
polymerase II and effecting polyadenylation of the
mRNA, which includes at least part of the termination/
polyadenylation region belonging to one of the GAPDH
genes of S. cerevisiae, characterised in that it
comprises a DNA sequence essentially as given in Fig. 3.

Indications have been obtained that the presence of
such a DNA sequence is favourable for the expression of
structural genes in yeasts. The presence of such a DNA
sequence seems to be particularly advantageous for
incorporating the rDNA in the genome of a yeast cell.

The invention also provides a DNA sequence, which can
be inserted into a recombinant DNA plasmid or into a
yeast chromosome, comprising:
(a) a DNA sequence essentially as given in Fig. 2, and
(b) one or more structural genes different from the
    GAPDH genes of S. cerevisiae, and at least two of
    features (c)-(f),
(c) one or more specific DNA sequences capable of
    terminating the transcription by yeast RNA polymerase
    II and effecting polyadenylation of the mRNA,
    and/or
(d) one or more selection markers, and/or
(e) either one or more nucleotide sequences allowing a
    stable insertion in a chromosome of yeasts or one
    or more DNA sequences which regulate DNA
    replication in yeasts belonging to the genus
    Saccharomyces or to the genus Kluyveromyces, and/or
(f) a DNA sequence encoding a signal polypeptide of not
    more than 30 amino acid residues assisting the
    translocation of proteins, which DNA sequence is
    situated upstream of and in the same reading frame
    as the structural gene.

In particular, the structural gene of (b) encodes
thaumatin or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins, since
such DNA sequence will assist in giving an increased
expression of these genes.

It is preferred to use a DNA sequence essentially
as given in Fig. 3 as the specific DNA sequence of (c),
since this seems better adapted to yeast than other
termination/polyadenylation regions.

The DNA sequence of (f) encoding a signal polypeptide
can be selected from the group consisting of DNA
sequences encoding
(a) the signal polypeptide translocating S. cerevisiae
invertase, namely Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.-
Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;
(b) the signal polypeptide translocating S. cerevisiae
acid phosphatase, namely Met.Phe.Lys.Ser.Val.Val.-
Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;
(c) the signal polypeptide of unmatured forms of
thaumatin-like proteins, namely Met.Ala.Ala.Thr.-
Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.-
Leu.Thr.Leu.Ser.Arg.Ala;
(d) the signal polypeptide of unmatured forms of
chymosin-like proteins, namely Met.Arg.Cys.Leu.Val.-
Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and
(e) two consensus signal polypeptides, namely
Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.-
Ile.Val.Leu.Ile.Val.Asn.Ala and
Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.-
Leu.Ala.Phe.Ile.Ala.Asn.Ala.
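Feature (f) limits the signal polypeptide to not more than 30 amino acid residues. As a small illustration, the sketch below counts the residues of the four naturally occurring signal polypeptides listed above. It is not part of the original specification: the dictionary keys, variable names and helper function are assumptions, and the residue names are written in the standard three-letter code.

```python
# Signal polypeptides (a)-(d) from the list above, in three-letter code.
SIGNAL_PEPTIDES = {
    "invertase": (
        "Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.Leu.Leu."
        "Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala"
    ),
    "acid phosphatase": (
        "Met.Phe.Lys.Ser.Val.Val.Tyr.Ser.Ile.Leu."
        "Ala.Ala.Ser.Leu.Ala.Asn.Ala"
    ),
    "thaumatin": (
        "Met.Ala.Ala.Thr.Thr.Cys.Phe.Phe.Phe.Leu.Phe."
        "Pro.Phe.Leu.Leu.Leu.Leu.Thr.Leu.Ser.Arg.Ala"
    ),
    "chymosin": (
        "Met.Arg.Cys.Leu.Val.Val.Leu.Leu.Ala.Val."
        "Phe.Ala.Leu.Ser.Gln.Gly"
    ),
}

MAX_RESIDUES = 30  # upper limit stated in feature (f)


def residue_count(peptide):
    """Count residues in a dot-separated three-letter peptide string."""
    return len(peptide.split("."))


lengths = {name: residue_count(seq) for name, seq in SIGNAL_PEPTIDES.items()}
for name, n in lengths.items():
    print(f"{name}: {n} residues, within limit: {n <= MAX_RESIDUES}")
```

All four sequences start with Met and stay well under the 30-residue limit.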

In order to make the conditions for expression as good
as possible, it is recommended to modify the structural
gene of (b) and/or the signal polypeptide-encoding DNA
sequence of (f) such that the codons are codons
preferred by yeasts. Following the teaching of J.P.
Holland and M.J. Holland (J. Biol. Chem. 255, 2596-2605
[1980]) the preferred codons are:
GCC or GCT   alanine            TTG          leucine
AGA          arginine           AAG          lysine
AAC          asparagine         ATG          methionine
GAC or GAT   aspartic acid      TTC          phenylalanine
TGT          cysteine           CCA          proline
CAA          glutamine          TCC or TCT   serine
GAA          glutamic acid      ACC or ACT   threonine
GGT          glycine            TGG          tryptophan
CAC          histidine          TAC          tyrosine
ATC or ATT   isoleucine         GTC or GTT   valine.
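To illustrate how the table of preferred codons could be applied, the following sketch back-translates an amino acid sequence into DNA using only those codons. It is a minimal example and not the procedure of the specification: the function names are hypothetical, and where the table offers two codons the code simply takes the first one listed unless told otherwise, which is an arbitrary assumption.

```python
# Preferred yeast codons exactly as listed in the table above.
PREFERRED_CODONS = {
    "Ala": ("GCC", "GCT"), "Arg": ("AGA",), "Asn": ("AAC",),
    "Asp": ("GAC", "GAT"), "Cys": ("TGT",), "Gln": ("CAA",),
    "Glu": ("GAA",), "Gly": ("GGT",), "His": ("CAC",),
    "Ile": ("ATC", "ATT"), "Leu": ("TTG",), "Lys": ("AAG",),
    "Met": ("ATG",), "Phe": ("TTC",), "Pro": ("CCA",),
    "Ser": ("TCC", "TCT"), "Thr": ("ACC", "ACT"), "Trp": ("TGG",),
    "Tyr": ("TAC",), "Val": ("GTC", "GTT"),
}


def parse_peptide(text):
    """Split a dot-separated three-letter peptide into residue names."""
    return [residue for residue in text.split(".") if residue]


def back_translate(residues, choice=0):
    """Return a DNA sequence built only from preferred codons.

    `choice` selects the first (0) or second (1) listed codon when two
    are given; residues with a single codon ignore it. Choosing by
    position in the table is an assumption made for this example.
    """
    codons = []
    for residue in residues:
        options = PREFERRED_CODONS[residue]
        codons.append(options[min(choice, len(options) - 1)])
    return "".join(codons)


# Signal polypeptide of thaumatin-like proteins, as given in the text.
THAUMATIN_SIGNAL = (
    "Met.Ala.Ala.Thr.Thr.Cys.Phe.Phe.Phe.Leu.Phe."
    "Pro.Phe.Leu.Leu.Leu.Leu.Thr.Leu.Ser.Arg.Ala"
)
dna = back_translate(parse_peptide(THAUMATIN_SIGNAL))
print(dna, len(dna))
```

A real gene design would also have to check the result for unwanted restriction sites and keep the reading frame with the structural gene; neither is attempted here.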



The DNA sequences described above are not an end in
themselves. They will be used in a process for
preparing yeasts containing these DNA sequences by
introducing these rDNA sequences into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

The invention further provides yeasts containing such
rDNA sequences, either in the form of a plasmid or
incorporated in the yeast genome, and their use in a
process for preparing a protein by cultivation of such
yeast, whereby the rDNA sequence incorporated in the
yeast contains a structural gene encoding that protein
or a precursor thereof, which precursor can form the
relevant protein during processing.

Finally, the invention provides the proteins produced
by this novel process.

Several constructions in which the GAPDH promoter/
regulation region was combined with structural genes
encoding either preprothaumatin or preprochymosin or
some of the various maturation products have been made,
and synthesis of thaumatin-like proteins in yeast has
been demonstrated. Preferred constructions contained
structural genes encoding either preprothaumatin or
prethaumatin. To improve the expression yield of said
genes, the original codons can be replaced by codons
which are abundantly present in highly expressed genes.
Moreover, the DNA sequence encoding the signal sequence
of preprochymosin can be replaced by DNA sequences
encoding signal sequences of products excreted by yeasts,
such as the signal sequences of invertase and acid
phosphatase produced by S. cerevisiae.

The DNA sequences according to the invention may
comprise nucleotide sequences which regulate DNA
replication in yeasts belonging to the genera Saccharomyces
and Kluyveromyces. If the replication origin is
inserted into an rDNA plasmid, the following combinations
are preferred for Saccharomyces: a combination of the
replication origin of the 2 micron DNA with the leu 2
gene as is present on plasmid pMP81 [C.P. Hollenberg,
Current Topics in Microbiol. and Immunol. 96, 119-144
(1982)] and a combination of the replication origin of
the 2 micron DNA with the trp 1 gene
present on plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)]. A preferred replication origin for
Kluyveromyces consists of the KARS-2 sequence in
combination with the trp 1 gene as is present on plasmid
pEK 2-7 described in European Patent Application
0096430(A1) on pages 21 and 25 and in Fig. 2.

If the DNA sequence has to be inserted into a yeast
genome, it is preferable that the DNA sequence contains
the termination region of Fig. 3 downstream of the
structural gene in addition to the regulon region of
Fig. 2 upstream of the foreign structural gene, which
combination will be inserted by homologous recombination at
the position of the GAPDH gene in the yeast genome (cf.
Fig. 19). Alternatively, the combination
regulon region - structural gene - termination region
may be inserted into a cloned DNA sequence derived from
the yeast genome (K. Struhl, Nature 305, 391-397 [1983] and
R.J. Rothstein, Methods in Enzymology 101, 202-211
[1983]).



For a better understanding of the invention the most
important terms used in the description will be
defined:

An operon is a gene comprising (a) a particular DNA
sequence [structural gene(s)] for polypeptide(s)
expression, (b) a control region or regulon (regulating
said expression) upstream of the structural gene and
mostly consisting of a promoter regulation sequence,
(c) a ribosome binding or interaction DNA sequence and
(d) a control region or transcription terminator
downstream of the structural gene.

Structural genes are DNA sequences which encode through
a template (mRNA) a sequence of amino acids
characteristic for a specific polypeptide.

A promoter is a DNA sequence within the regulon to
which RNA polymerase binds for the initiation of the
transcription.

A terminator is a DNA sequence within the operon
comprising, amongst others, particular DNA sequences
involved in the polyadenylation of mRNA and particular
DNA sequences involved in the termination of the
transcription of DNA by RNA polymerase.

Reading frame. The grouping of triplets of nucleotides
(codons) into such a frame that at mRNA level a proper
translation of the codons into the polypeptide takes
place.

Transcription. The process of producing mRNA from a
structural gene.

Translation. The process of producing a polypeptide
from mRNA.

Expression. The process undergone by a structural gene
to produce a polypeptide. It is a combination of many
processes, including at least transcription and
translation.

By signal sequence (also called signal polypeptide or
leader sequence) is meant that part of the
pre(pro)protein which has a high affinity to biomembranes or
special receptor proteins in biomembranes and/or which
is involved in the transport/translocation of
pre(pro)protein. These transport/translocation processes are
often accompanied by processing of the pre(pro)protein
into one of the mature forms of the protein.

Allelic form. One of the two or more naturally
occurring alternative forms of a gene product.

Chromosome. Thread-like structures into which the
hereditary material of cells is associated.

Genome. The total genetic information of cells
organised in the chromosomes.

Maturation form. One of two or more naturally
occurring forms of a gene product procured by specific
processing, e.g. specific proteolysis.

By maturation forms of preprothaumatin are meant
prothaumatin, prethaumatin (= preprothaumatin without
a carboxyl terminal sequence of 6 amino acids) and
thaumatin (EP-PA 54330 and EP-PA 54331).
By maturation forms of preprochymosin are meant
prochymosin, pseudo-chymosin and chymosin (EP-PA
77109).

Plus strand. DNA strand whose nucleotide sequence
is identical with the mRNA sequence, with the proviso
that uridine is replaced by thymidine.

The microbial cloning vehicles - containing (a) the
various forms of the regulons and terminators
(indicated with the suffix -01, -02, etc. in the plasmids
described in the specification) of the glyceraldehyde-
3-phosphate dehydrogenase (GAPDH) operon of S.
cerevisiae, (b) structural genes encoding
preprothaumatin and preprochymosin and their various
maturation forms, (c) various hybrid forms of said
structural genes encoding maturation forms of
preprothaumatin or preprochymosin with special signal
sequences and (d) various chemically synthesized DNA
sequences - are produced by a number of steps, the most
essential of which are:

1. Isolation of clones containing the GAPDH operon of
   S. cerevisiae.

2. Isolation of the GAPDH promoter/regulation region
   and its introduction into plasmids encoding
   thaumatin-precursors.

3. Introduction of the GAPDH promoter/regulation
   region into plasmids encoding chymosin-precursors.

4. Reconstitution of the original GAPDH promoter/
   regulation region in plasmids encoding
   preprothaumatin by introduction of a synthetic DNA
   fragment (Fig. 7A, Fig. 8).

5. Reconstitution of the original GAPDH promoter/
   regulation region in plasmids encoding
   (pre)(pro)(pseudo)chymosin by introduction of a
   synthetic DNA fragment.

6. DNA synthesis.

7. Structural features of the GAPDH promoter/regulation
   region.

8. Insertion of fragments of the GAPDH transcription
   termination/polyadenylation region in combination
   with the central transcription termination signal
   of phage M13 RF downstream of genes encoding
   pseudochymosin.

9. Introduction of the 2 micron DNA replication origin
   and the yeast leu 2 gene in plasmids encoding
   thaumatin-precursors and chymosin-precursors.

10. The introduction of GAPDH transcription termination/
    polyadenylation regions into pURY plasmids.

11. Construction of an E. coli-yeast shuttle vector
    widely applicable for gene expression in yeast.

12. Expression in K. lactis of the preprothaumatin
    encoding gene under control of the promoter/
    regulation region of the GAPDH encoding gene of
    S. cerevisiae.

13. Integration of structural genes under control of
    the GAPDH promoter/regulation region into the yeast
    chromosome.

14. Chemical synthesis of structural genes and
    construction of synthetically chimeric genes.

The following detailed description will illustrate the 
invention. 

1. Isolation of clones containing the glyceraldehyde-
3-phosphate dehydrogenase (GAPDH) operon of
S. cerevisiae.

A DNA pool of the yeast S. cerevisiae was prepared
in the hybrid E. coli-yeast plasmid pFl 1 [M.
Chevallier et al., Gene 11, 11-19 (1980)] by a method
similar to the one described by M. Carlson and D.
Botstein, Cell 28, 145-154 (1982). Purified yeast
DNA was partially digested with restriction
endonuclease Sau 3A and the resulting DNA fragments
(with an average length of 5 kb) were ligated by T4
DNA ligase into the dephosphorylated Bam HI site of
pFl 1. After transformation of CaCl2-treated E.
coli cells with the ligated material a pool of about
30,000 ampicillin-resistant clones was obtained.
These clones were screened by a colony hybridization
procedure [R.E. Thayer, Anal. Biochem., 98, 60-63
(1979)] with a chemically synthesized and 32P-
labelled oligomer with the sequence
5'TACCAGGAGACCAACTT3'.

According to data published by J.P. Holland and M.J.
Holland [J. Biol. Chem., 255, 2596-2605 (1980)]
this oligomer is complementary to the DNA sequence
encoding amino acids 306-310 (the wobble base of the
last amino acid was omitted from the oligomer) of
the GAPDH gene. Using hybridization conditions
described by R.B. Wallace et al., Nucleic Acids Res.
9, 879-894 (1981), six positive transformants could
be identified. One of these harboured plasmid pFl 1-33.
The latter plasmid contained the GAPDH gene
including its promoter/regulation region and its
transcription termination/polyadenylation region.
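Since the oligomer is described as complementary to the coding sequence, the stretch of plus strand it detects is its reverse complement. The sketch below derives that target and splits it into triplets. It is only an illustration: the helper names are hypothetical, and the reading frame (starting at the first base of the target, with the last triplet left incomplete because the wobble base was omitted) is an assumption inferred from the text rather than something the text states.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


PROBE = "TACCAGGAGACCAACTT"          # oligomer as given in the text (5'->3')
target = reverse_complement(PROBE)   # plus-strand stretch the probe pairs with

# Assumed frame: triplets counted from the first base of the target.
triplets = [target[i:i + 3] for i in range(0, len(target), 3)]
print(target)
print(triplets)
```

With this frame the target consists of five complete triplets followed by two bases, consistent with a probe that stops one base short of a full final codon.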

The approximately 9 kb long insert of pFl 1-33 has
been characterized by restriction enzyme analysis
(Fig. 1) and partial nucleotide sequence analysis
(Figs. 2 and 3).
Note: Unless stated otherwise, all enzyme
incubations were carried out under conditions
described by the supplier. Enzymes were
obtained from Amersham, Boehringer, BRL or
Biolabs.

2. Isolation of the GAPDH promoter/regulation region
and its introduction into plasmids encoding
thaumatin-precursors (Fig. 4).

On the basis of the restriction enzyme analysis and
the nucleotide sequence data of the insert of
plasmid pFl 1-33, the RNA initiation/regulation
region of the GAPDH gene was isolated as an 800
nucleotides long Dde I fragment. To identify this
promoter fragment, plasmid pFl 1-33 was digested
with Sal I and the three resulting DNA fragments
were subjected to a Southern hybridization test
with the chemically synthesized oligomer [E.M.
Southern, J. Mol. Biol. 98, 503-517 (1975)]. A
positively hybridizing 4.3 kb long restriction fragment
was isolated on a preparative scale by electroelution
from a 0.7% agarose gel and was then cleaved
with Dde I. Of the resulting Dde I fragments only
the largest one had a recognition site for Pvu II,
a cleavage site located within the GAPDH promoter
region (Fig. 1). The largest Dde I fragment was
isolated and incubated with Klenow DNA polymerase
and four dNTP's [A.R. Davis et al., Gene 10, 205-
218 (1980)] to generate a blunt-ended DNA molecule.

After extraction of the reaction mixture with
phenol/chloroform (50/50 v/v), passage of the
aqueous layer through a Sephadex G50 column and
ethanol precipitation of the material present in
the void volume, the DNA fragment was equipped with
the 32P-labelled Eco RI linker 5'GGAATTCC3' by
incubation with T4 DNA ligase. Owing to the Klenow
DNA polymerase reaction and the subsequent ligation
of the Eco RI linker, the original Dde I sites were
reconstructed at the ends of the promoter fragment.
To inactivate the ligase the reaction mixture was
heated to 65°C for 10 minutes, then sodium chloride
was added (final concentration 50 mmol/l) and the
whole mix was incubated with Eco RI. Incubation was
terminated by extraction with phenol/chloroform,
the DNA was precipitated twice with ethanol,
resuspended and then ligated into a suitable vector
molecule. Since the Dde I promoter fragment was
equipped with Eco RI sites it can be easily
introduced into the Eco RI site of pUR 528, pUR 523
and pUR 522 (EP-PA 54330 and EP-PA 54331) to create
plasmids in which the yeast promoter is adjacent to
the structural genes encoding thaumatin precursors.
The latter plasmids were obtained by cleavage of
pUR 528, pUR 523 and pUR 522 with Eco RI, treatment
of the linearized plasmid molecules with (calf
intestinal) alkaline phosphatase to prevent self-
ligation and incubation of each of these vector
molecules, as well as the purified Dde I promoter
fragment, with T4 DNA ligase. Transformation of the
various ligation mixes in CaCl2-treated E. coli
HB101 cells yielded several ampicillin-resistant
colonies. From some of these colonies plasmid DNA
was isolated [H.C. Birnboim and J. Doly, Nucleic
Acids Res. 7, 1513-1523 (1979)] and incubated with
Pvu II to test the orientation of the insert.
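The statement that the Dde I sites are reconstructed after the Klenow fill-in and linker ligation can be checked with a short sketch. It assumes the usual Dde I specificity (C^TNAG, leaving a three-nucleotide 5' overhang); the test sequence and the function names `fill_in_dde_fragments` and `add_linkers` are hypothetical and serve only as an illustration.

```python
import re

DDE_I = re.compile(r"CT[ACGT]AG")  # Dde I site, cut after the C (C^TNAG)
LINKER = "GGAATTCC"                # Eco RI linker quoted in the text


def fill_in_dde_fragments(top: str) -> list[str]:
    """Cut the top strand at every Dde I site and fill in the 5' overhangs.

    After the fill-in, the left-hand piece of each cut ends in ...CTNA and
    the right-hand piece starts with TNAG..., so neighbouring blunt-ended
    fragments share the three overhang nucleotides.
    """
    cuts = [m.start() + 1 for m in DDE_I.finditer(top)]
    bounds = [0] + cuts + [len(top)]
    fragments = []
    for start, end in zip(bounds, bounds[1:]):
        if end != len(top):
            end += 3  # Klenow polymerase copies the TNA overhang
        fragments.append(top[start:end])
    return fragments


def add_linkers(fragment: str) -> str:
    """Ligate one Eco RI linker to each blunt end of the fragment."""
    return LINKER + fragment + LINKER


# Hypothetical sequence with two Dde I sites (CTGAG and CTCAG).
example = "AAACTGAGTTTTCCCCGGGGTTTTCTCAGAAA"
internal = fill_in_dde_fragments(example)[1]
print(add_linkers(internal))
```

In the linkered fragment the final C of the left linker followed by TNAG, and the terminal CTNA followed by the first G of the right linker, each restore a CTNAG site.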

In the nomenclature plasmids containing the Eco RI
(Dde I) GAPDH promoter fragment in the correct
orientation (i.e. transcription from the GAPDH
promoter occurs in the direction of a downstream
located structural gene) are indicated by the
addendum -01 to the original code of the plasmid
(for example pUR 528 is changed into pUR 528-01;
see Fig. 4).

To facilitate manipulation of plasmids containing
the Eco RI promoter fragment, one of the two Eco RI
sites was destroyed. Two µg of plasmid DNA (e.g.
pUR 528-01) was partially digested with Eco RI and
then incubated with 5 units Mung bean nuclease
(obtained from P.L. Biochemicals Inc.) in a total
volume of 200 µl in the presence of 0.05 moles/l
sodium acetate (pH 5.0), 0.05 moles/l sodium
chloride and 0.001 moles/l zinc chloride for 30
minutes at room temperature to remove sticky ends.
The nuclease was inactivated by addition of SDS to
a final concentration of 0.1% [D. Kowalski et al.,
Biochemistry 15, 4457-4463 (1976)] and the DNA was
precipitated by the addition of 2 volumes of
ethanol (in this case the addition of 0.1 volume of
3 moles/l sodium acetate was omitted). Linearized
DNA molecules were then religated by incubation
with T4 DNA ligase and used to transform CaCl2-
treated E. coli cells. Plasmid DNA isolated from
ampicillin-resistant colonies was tested by
cleavage with Eco RI and Mlu I for the presence of
a single Eco RI site adjacent to the thaumatin gene
(cf. Fig. 4).

Plasmids containing the GAPDH promoter fragment,
but having only a single Eco RI recognition site
adjacent to the ATG initiation codon of a downstream
located structural gene, are referred to as
-02 type plasmids (for example: pUR 528-01 is
changed into pUR 528-02; see Fig. 4).

3. Introduction of the GAPDH promoter/regulation
region into plasmids encoding chymosin-precursors.

To construct plasmids containing the GAPDH
promoter/regulation region adjacent to structural
genes encoding chymosin precursors, use was made of
plasmid pUR 528-02 (cf. 2) digested with Eco RI and
Hind III as a vector molecule in which various
structural genes were inserted. To overcome the
problem that all of the (pre)(pro)(pseudo)chymosin
encoding plasmids (pUR 1524, pUR 1523 and pUR 1521,
respectively, as described in EP-PA 77109) contain
an additional Eco RI site within the structural
gene, a Hind III digestion was carried out in
combination with a partial Eco RI digest. Restriction
fragments containing the intact and isolated gene
were extracted from the 1% agarose gel and added to
a T4 DNA ligation mix together with the vector. The
vector was prepared by digesting pUR 528-02 with
Hind III and Eco RI, treating the resulting
fragments with phosphatase and isolating the largest
fragment from a 0.7% agarose gel. After
transformation of CaCl2-treated E. coli cells with the
ligation mix and selection for ampicillin-resistant
colonies, plasmids containing the GAPDH promoter/
regulation fragment adjacent to the
(pre)(pro)(pseudo)chymosin encoding structural genes
in the appropriate orientation could be isolated. In a
similar way, plasmid pUR 1522 can be converted into
plasmid pUR 1522-02.

Plasmids containing the GAPDH promoter fragment and
the genes coding for prepro-, pro-, pseudochymosin
and chymosin are referred to as pUR 1524-02, pUR
1523-02, pUR 1521-02 and pUR 1522-02, respectively.

4. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding
preprothaumatin by introduction of a synthetic DNA
fragment (Fig. 7A, Fig. 8).

As shown by the nucleotide sequence depicted in
Fig. 2, the Eco RI (Dde I) GAPDH promoter fragment
contains the nucleotides -844 to -39 of the original
GAPDH promoter/regulation region. Not contained
in this promoter fragment are the 38 nucleotides
preceding the ATG initiation codon of the GAPDH
encoding gene. The latter (38) nucleotide fragment
contains the PuCACACA sequence, which is found in
several yeast genes. Said PuCACACA sequence, located
about 20 bp upstream of the translation start site
[M.J. Dobson et al., Nucleic Acids Res. 10, 2625-
2637 (1982)], provides the nucleotide sequence which
is optimal for protein initiation [M. Kozak,
Nucleic Acids Res. 9, 5233-5252 (1981)]. Moreover,
as shown in Fig. 6, these 38 nucleotides allow the
formation of a small loop structure which might be
involved in the regulation of expression of the
GAPDH gene.
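The coordinates quoted above can be checked with a few lines of Python. The function name `span_length` is an assumption made for this illustration; positions are counted as negative offsets upstream of the ATG codon, so no position 0 occurs within the spans considered.

```python
def span_length(first: int, last: int) -> int:
    """Number of nucleotides from position `first` to `last`, inclusive."""
    return last - first + 1


promoter_fragment = span_length(-844, -39)  # Eco RI (Dde I) promoter fragment
missing_leader = span_length(-38, -1)       # stretch preceding the ATG codon

print(promoter_fragment, missing_leader, promoter_fragment + missing_leader)
```

On these figures the Dde I fragment comprises 806 nucleotides, consistent with the rounded value of 800 used elsewhere in the text, and the complete region upstream of the ATG codon comprises 844 nucleotides.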

On the basis of the above-mentioned arguments,
introduction of the 38 nucleotides between the Dde I
promoter fragment and the ATG codon of a downstream
located structural gene was considered necessary to
improve promoter activity as well as translation
initiation.



As outlined in Fig. 7A the missing DNA fragment was
obtained by the chemical synthesis of two partially
overlapping oligomers. The Sac I site present in
the overlapping part of the two oligonucleotides
was introduced for two reasons: (i) to enable
manipulation of the nucleotide sequence immediately
upstream of the ATG codon, including the construction
of poly A-tailed yeast expression vectors (see
11); (ii) to give a cleavage site for an enzyme
generating 3'-protruding ends that can easily and
reproducibly be removed by incubation with T4 DNA
polymerase in the presence of the four dNTP's.
Equimolar amounts of the two purified oligomers
were phosphorylated at their 5'-termini, hybridized
[J.J. Rossi et al. (1982), J. Biol. Chem. 257,
9226-9229] and converted into a double-stranded DNA
molecule by incubation with Klenow DNA polymerase
and the four dNTP's under conditions which have
been described for double-stranded DNA synthesis
[A.R. Davis et al., Gene 10, 205-218 (1980)].
Analysis of the reaction products by electrophoresis
through a 13% acrylamide gel followed by
autoradiography showed that more than 80% of the
starting single-stranded oligonucleotides were
converted into double-stranded material. The DNA
was isolated by passage of the reaction mix over a
Sephadex G50 column and ethanol precipitation of
the material present in the void volume. The DNA
was then phosphorylated by incubation with
polynucleotide kinase and digested with Dde I. To
remove the nucleotides cleaved off in the latter
reaction, the reaction mix was subjected to two
precipitations with ethanol.
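The conversion of two partially overlapping oligomers into one double-stranded molecule can be sketched as follows. Both oligomer sequences below are hypothetical and were chosen only so that their overlap contains a Sac I site (GAGCTC); they are not the sequences of Fig. 7A, and the function names are assumptions made for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def fill_in(top_oligo: str, bottom_oligo: str) -> str:
    """Top strand of the duplex obtained by mutually primed synthesis.

    The 3' ends of the two oligomers anneal; the polymerase then extends
    each 3' end using the other oligomer as template.
    """
    bottom_as_top = reverse_complement(bottom_oligo)
    for k in range(min(len(top_oligo), len(bottom_as_top)), 0, -1):
        if top_oligo[-k:] == bottom_as_top[:k]:
            return top_oligo + bottom_as_top[k:]
    raise ValueError("oligomers do not overlap at their 3' ends")


# Hypothetical oligomers, both written 5'->3'.
TOP = "TTAGTCGAATAAACACACGAGCTCAA"
BOTTOM = "CATTTTGTTTTATTGAGCTCGT"

duplex = fill_in(TOP, BOTTOM)
print(len(duplex), duplex)
```

The sketch returns only the top strand; the bottom strand of the filled-in product is its reverse complement.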

As shown in Fig. 8, cloning of the resulting
synthetic DNA fragment was carried out by the
simultaneous ligation of this fragment and a Bgl II-Dde
I GAPDH promoter regulation fragment in a vector
molecule from which the Eco RI site preceding the
ATG initiation codon was removed by Mung bean
nuclease digestion (cf. 2). The Bgl II-Dde I
promoter/regulation fragment was obtained by
digestion of plasmid pUR 528-02 with Dde I and Bgl
II. Separation of the resulting restriction
fragments by electrophoresis through a 2% agarose gel
and subsequent isolation of the fragment from the
gel yielded the purified 793 nucleotides long
promoter/regulation fragment. In the plasmid pUR
528-02 the nucleotide sequence preceding the ATG
codon is 5'-GAATTCATG-3' (EP-PA 54330 and EP-PA
54331), which is different from the favourable
nucleotide sequence given by M. Kozak [Nucleic
Acids Res. 9, 5233-5252 (1981)]. Since our aim was
to reconstitute the original GAPDH promoter/
regulation/protein initiation region as accurately as
possible, the Eco RI site was removed in order to
ligate the synthetic DNA fragment to the resulting
blunt end. Removal of the Eco RI site was
accomplished by Mung bean nuclease digestion of Eco RI-
cleaved pUR 528-02 DNA (see 2).

Subsequently the plasmid DNA was digested with Bgl
II and incubated with phosphatase. After separation
of the two DNA fragments by electrophoresis through
a 0.7% agarose gel, the largest fragment was
isolated and used as the vector in which the Bgl
II-Dde I promoter fragment as well as the Dde I-
treated synthetic DNA fragment were ligated.
Plasmids into which the Dde I promoter/regulation
fragment and the synthetic DNA fragment containing
the Sac I recognition site have been
introduced are indicated by the addendum -03 (for
example: pUR 528-02 is changed into pUR 528-03).

Similar results can be obtained with plasmids
containing one of the maturation forms of
preprothaumatin as the structural gene, i.e.
prethaumatin, prothaumatin and thaumatin, which will
result in plasmids pUR 522-03, pUR 523-03 and
pUR 520-03, respectively.

In order to reconstitute the original GAPDH
promoter/regulation region as accurately as
possible, the Sac I site was removed from plasmid
pUR 528-03 (Fig. 9). This was accomplished by
digestion of the plasmid DNA with Sac I to generate
a linearized plasmid molecule with protruding 3'
ends. These ends were then made blunt using
the 3'-exonuclease activity of T4 DNA polymerase in
the presence of the four dNTP's [T. Maniatis et al.
in Molecular Cloning; Cold Spring Harbor Laboratory
117-120 (1982)]. Circularisation of the linear
plasmid was accomplished by using T4 DNA ligase.

Plasmids from which the Sac I site present in the
synthetic DNA fragment is removed are referred to
as -04 type plasmids (for example: pUR 528-03 is
changed into pUR 528-04; see Fig. 9).
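The effect of the Sac I/T4 DNA polymerase treatment on the nucleotide sequence can be illustrated with a short sketch. It assumes the usual Sac I specificity (GAGCT^C, leaving a four-nucleotide 3' overhang on each strand); the function name `remove_sac_i_site` and the example sequence are hypothetical.

```python
SAC_I = "GAGCTC"  # Sac I cuts GAGCT^C, leaving 3' overhangs of 4 nucleotides


def remove_sac_i_site(top: str) -> str:
    """Top strand after Sac I cleavage, 3' overhang removal and religation.

    Only sequences with at most one Sac I site are handled in this sketch.
    """
    index = top.find(SAC_I)
    if index < 0:
        return top
    if top.find(SAC_I, index + 1) >= 0:
        raise ValueError("more than one Sac I site present")
    left = top[: index + 5]   # ...GAGCT, with AGCT protruding at the 3' end
    right = top[index + 5:]   # C..., the protruding end lies on the other strand
    left_blunt = left[:-4]    # the 3'-exonuclease activity trims the overhang
    return left_blunt + right


print(remove_sac_i_site("AAAGAGCTCATG"))
```

Under these assumptions the hexanucleotide GAGCTC is reduced to GC, i.e. four base pairs are lost and the recognition site disappears.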

Similar results can be obtained with plasmids
containing a structural gene for one of the maturation
forms of preprothaumatin, i.e. prethaumatin,
prothaumatin and thaumatin, which will result in e.g.
plasmids pUR 522-04, pUR 523-04 and pUR 520-04,
respectively.

5. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding
(pre)(pro)(pseudo)chymosin by introduction of a
synthetic DNA fragment (Fig. 7B, Fig. 10).

To construct (pre)(pro)(pseudo)chymosin encoding
plasmids containing the -03 type GAPDH promoter/
regulation region (see 4), use can be made of
plasmid pUR 528-03 digested with Sac I and Hind III as
a vector molecule in which the various structural
genes are inserted together with a synthetic DNA
fragment (Fig. 10). The (pre)(pro)(pseudo)chymosin
encoding genes can be isolated from the plasmids
pUR 1524, pUR 1523, pUR 1521 and pUR 1522 (cf.
EP-PA 77109) respectively by incubation with Sal I
in combination with a partial Eco RI digestion to
overcome cleavage of the additional Eco RI site in
the chymosin gene (cf. 3). The resulting DNA
fragments can then be incubated with Mung bean
nuclease (cf. 2) followed by a digestion with Hind
III. Restriction fragments containing the intact
and isolated gene can be purified by electrophoresis
through a 1% agarose gel and then
extracted from the gel. The synthetic DNA fragment
to be used in the constructions is depicted in Fig.
7B.

The vector molecule can be prepared by digestion of
plasmid pUR 528-03 with Sac I and Hind III followed
by a phosphatase treatment of the restriction
fragments and isolation of the largest fragment from
the 0.7% agarose gel. Vector molecule, synthetic
DNA fragment (Sac I-treated) and the DNA fragment
containing the (pre)(pro)(pseudo)chymosin encoding
nucleotide sequence can be incubated with T4 DNA
ligase and transformed in CaCl2-treated E. coli
cells. Plasmid DNA obtained from ampicillin-
resistant colonies can be tested by incubation with
various restriction enzymes.

The nomenclature of the newly created plasmids is
similar to the nomenclature of the (pre)(pro)-
thaumatin-encoding plasmids (pUR 1524-03, pUR 1523-
03, pUR 1521-03 and pUR 1522-03). Removal of the
Sac I site has also been described (see 4). Removal
of the Sac I sites will result in the plasmids pUR
1524-04, pUR 1523-04, pUR 1521-04 and pUR 1522-04,
respectively (see Fig. 10).

6. DNA synthesis. 



Desired oligonucleotides were synthesized on a
polystyrene support using phosphotriester
methodology and a library of dimers [G.A. van der Marel
et al., Recl. Trav. Chim. Pays-Bas 101, 234-241
(1982)]. Chloromethylated polystyrene (1.34
mmoles/gram) was functionalized with 2-(4-
hydroxyphenyl)ethanol [Su-Sun Wang, J. Am. Chem.
Soc. 95, 1325-1333 (1973)] and then coupled with 5'
phosphorylated uridine protected as a mixture of 2'
and 3' acetate and levulinyl groups (analogously to
G.A. van der Marel et al., vide supra). This gave a
support with a levulinyl functionality of 130
µmoles/g. The synthesis cycle was carried out
in a 20 ml 2-necked pear-shaped flask, one neck of
which carried a sintered glass filter. This allowed
the functionalized polystyrene (60 mg) to remain in
the flask throughout the synthesis, which consisted
of the following steps (cf. Fig. 11).

1). Removal of the "levulinyl" group with hydrazine
    hydrate (0.5 mol/l) in propionic acid/pyridine
    (1:3 v/v) for 5 minutes.
2). Washing with 2 x 2 ml pyridine.
3). Addition of a fourfold molar excess of the
    required dinucleotide anion (60 µmoles)
    protected in the 5' position as levulinyl
    ester. This was coevaporated twice with
    pyridine and reduced to approximately 0.5 ml
    before adding MSNT (240 µmoles) [C.B. Reese
    et al., Tetrahedron Letters 2727-2730 (1978)].
    The mixture was shaken for 60 minutes (except
    the first cycle which was lengthened to 90
    minutes).
4). Washing with 2 x 2 ml pyridine.
5). Addition of acetic anhydride (0.2 ml) and
    dimethylaminopyridine in pyridine (1.5 ml,
    0.05 M) to react with any unreacted hydroxyl
    group during 5 minutes.
6). Washing with 2 x 2 ml pyridine.

This cycle was repeated with the appropriate
dimers until the desired sequence had been
prepared. The 2-chlorophenyl phosphate
protecting groups were removed with 0.3 mol/l
1,1,4,4-tetramethylguanidinium 2-
pyridinealdoximate (C.B. Reese et al., vide
supra) in dry acetonitrile for 24 hours. After
filtration and washing of the support the base
protecting groups (dA = benzoyl, dC = anisoyl,
dG = diphenylacetyl) were cleaved with
concentrated aqueous ammonia for 60 hours at
50°C, which also cleaves the sequence from the
uridine attached to the support [R. Crea and T.
Horn, Nucleic Acids Res. 8, 2331-2348 (1980)].
Filtration of the aqueous phase from the
support and evaporation gave a crude mixture
which was given a clean-up by chromatography on
a short column (40 cm) of Sephadex G50 with
0.05 M triethylammonium bicarbonate as eluent.
Collecting and evaporating the first five
fractions containing UV absorbing material gave
a concentrate suitable for preparative gel
electrophoresis.

7. Structural features of the GAPDH promoter/
regulation region (Fig. 2, Fig. 6).

TATA or TATAAA sequences are believed to play an
important role in positioning RNA initiation sites
in eucaryotic promoter structures which are
recognized by RNA polymerase II [R. Breathnach and
P. Chambon, Annu. Rev. Biochem. 50, 349-384
(1981)]. Usually transcription is initiated 25 to
30 nucleotides 3' to such sequences, although for
yeast varying distances (up to 70 nucleotides; M.J.
Dobson et al., Nucleic Acids Res. 10, 2625-2637
(1982)) have been described. According to the
nucleotide sequence data obtained during the
present investigations for the GAPDH promoter/
regulation region, this structure contains two
additional sets of TATA and TATAAA sequences which
are located towards the ends of the promoter
fragment (Fig. 2). Most likely the clustered
5'TATATAA3' sequence occurring around position -130
is responsible for transcription of the GAPDH
encoding gene. The two other TATA and TATAAA
sequences present around positions -608 and -770
are possibly involved in regulation of GAPDH
expression (see below; P.K. Maitra and Z. Lobo, J.
Biol. Chem. 246, 489-499 (1971)). Besides these
RNA initiation signals, the GAPDH promoter/
regulation fragment contains various nucleotide
sequences which are implicated in transcription
termination. K.S. Zaret and F. Sherman [Cell 28,
563-573 (1982)] found the sequences
TAG-N-TAGT-N'-TTT or TAG-N''-TATGT-N'''-TTT, in
which N, N', N'' and N''' represent
the variable distances between the groups, in the
3' flanking sequences of a majority of yeast genes
examined. Identical or similar sequences occur in
the GAPDH promoter/regulation fragment around
position -625 (TAT-N4-TAGT-N5-TTT), around
position -324 (TAC-N13-TAGT-N17-TTA), around
position -192 (TAC-Ng-TATGT-N^-TTT) and around
position -180 (TAT-N17-TAGT-N23-TTT). In Fig. 6
these postulated termination signals are
represented by slashes. The largest open translation
reading frame present on the GAPDH promoter/
regulation fragment extends from position -450 to -337
and encodes a peptide of 38 amino acids. Since
the TATA box around position -608 precedes the open
reading frame it might well be that the putative
peptide is translated from a transcript initiated
downstream of this TATA sequence. Particularly
interesting is the observation that the ATG
initiation codon of the peptide forms part of a
secondary structure which is generated upon base
pairing of the nucleotides extending from -448 to
-436 with the nucleotides extending from -419 to
-407. These features are reminiscent of the
situation described for the yeast leu 2 gene [A.
Andreadis et al., Cell 31, 319-325 (1982)] and
together with the presence of the small stem/loop
structure preceding the ATG codon of the GAPDH
encoding gene they might be involved in regulating
the expression of the latter gene.
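A search for the termination-like signals discussed above can be sketched with a regular expression. The sketch matches only the two canonical patterns (TAG...TAGT...TTT and TAG...TATGT...TTT); the maximum spacer length of 30 nucleotides, the function names and the test sequence are assumptions made for this illustration and are not taken from Fig. 2.

```python
import re


def terminator_pattern(max_gap: int = 30) -> re.Pattern:
    """Regex for TAG-N-TAGT-N'-TTT or TAG-N''-TATGT-N'''-TTT.

    The spacers N are matched lazily and limited to `max_gap` nucleotides.
    """
    gap = rf"[ACGT]{{0,{max_gap}}}?"
    return re.compile(rf"TAG{gap}TA(?:G|TG)T{gap}TTT")


def find_terminator_like(upstream: str, max_gap: int = 30):
    """Return (position, matched sequence) pairs.

    `upstream` is assumed to end immediately before the ATG codon, so the
    nucleotide at index i has position i - len(upstream).
    """
    pattern = terminator_pattern(max_gap)
    return [(m.start() - len(upstream), m.group())
            for m in pattern.finditer(upstream)]


# Hypothetical upstream sequence containing one canonical signal.
print(find_terminator_like("CCTAGCCCCTAGTGGGGGTTTCC"))
```

The variant signals listed in the text (for example those starting with TAT or TAC) would require a correspondingly relaxed pattern.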

8. Insertion of fragments of the GAPDH transcription
termination/polyadenylation region in combination
with the central transcription termination signal
of phage M13 RF downstream of genes encoding
pseudochymosin (Fig. 12).

To isolate fragments of the GAPDH transcription
termination/polyadenylation region, plasmid pFl 1-33
was cleaved with restriction enzyme Afl II.
Digestion with this enzyme yielded two fragments, the
smaller of which has a length of 1307 nucleotides
and encompasses the GAPDH termination/
polyadenylation region from position 11 to 1317 (Fig. 12). The
latter fragment was isolated, incubated with Mung
bean nuclease to generate blunt ends (cf. 2) and
then equipped with the Bam HI linker
(5'CCGGATCCGG3') using T4 DNA ligase. Owing to the
presence of a naturally occurring Bam HI site in the
middle (around position 690) of the Afl II fragment
isolated, digestion of the ligation mix with Bam HI
resulted in two fragments (A and B; Fig. 12). The
larger of these two fragments (A) has a length of
677 nucleotides and contains the nucleotide
sequence region which is located immediately
downstream of the TAA translation termination codon.
Attempts to subclone the purified fragment A in
the Bam HI site of pBR322 proved to be unsuccessful
since very few transformants were obtained and
these transformants always contained fragment A as
well as fragment B in various orientations (cf.
plasmid 294-17 depicted in Fig. 12).

To overcome this instability of fragment A, use
was made of the central transcription termination
signal of bacteriophage M13 RF [L. Edens et al.,
Nucleic Acids Res. 2, 1811-1820 (1975)]. M13 RF was
digested with Taq I and the resulting fragments
were made blunt-ended by incubation with Mung bean
nuclease. The fragments were then equipped with a
Hind III linker (5'AGAAGCTTCT3') using T4 DNA
ligase followed by a digestion with Hind III. The
DNA was precipitated from the reaction mix by the
addition of two volumes of ethanol. The precipitate
was resuspended and the various restriction
fragments were separated by electrophoresis through
a 4% acrylamide gel. From this gel the 441
nucleotides long fragment containing the central
transcription termination signal was isolated.
After cleavage of the purified fragment with Sau
3A, the fragment containing the nucleotides 1509 to
1717 [P.M.G. van Wezenbeek et al., Gene 11, 129-148
(1980)] was cloned in pBR322 which had been
digested with Hind III and Bam HI (Fig. 12).

To create plasmids in which the 3'-untranslated
region of the genes encoding (pre)(pro)(pseudo)-
chymosin was replaced by the 3'-untranslated region
of the GAPDH encoding gene, plasmid pUR 1521-02 was
isolated from an E. coli strain deficient in
adenine methylase and cleaved with restriction
endonuclease Bcl I. This enzyme recognizes the
(unmethylated) sequence 5'TGATCA3' and cleaves the
(pre)(pro)(pseudo)chymosin encoding genes within
their translation termination codon TGA. To
generate suitable vector molecules in which both
fragment A and the M13 transcription termination
signal could be cloned, the Bcl I-cleaved plasmid
molecules were incubated first with Sal I, followed
by alkaline phosphatase. The larger of the two
restriction fragments generated was isolated from
the 0.7% agarose gel. For the isolation of fragment
A, plasmid 294-17 was digested with Hind III and
cleaved partially with Bam HI. The 1020 nucleotides
long Hind III-Bam HI fragment containing the Hpa I
site was obtained by electroelution from the 1%
agarose gel. The latter fragment, the vector and
the 485 nucleotides long Hind III-Sal I fragment
obtained from pBR322 containing the M13
transcription termination signal were incubated with T4
DNA ligase and transformed in CaCl2-treated E.
coli cells. This finally yielded the plasmid pUR
1521-12. Similar results can be obtained with
plasmids pUR 1524-02, pUR 1523-02 and pUR 1522-02,
which will result in plasmids pUR 1524-12, pUR
1523-12 and pUR 1522-12, respectively.
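The reason for using a methylase-deficient host and the position of the Bcl I cut relative to the stop codon can be illustrated as follows. The sketch assumes the usual Bcl I specificity (T^GATCA) and the GATC target of the E. coli adenine methylase; the short reading frame and the function names are hypothetical.

```python
BCL_I = "TGATCA"     # Bcl I site, cut after the first T (T^GATCA)
DAM_TARGET = "GATC"  # sequence modified by the adenine methylase


def bcl_i_sites(seq: str) -> list[int]:
    """Start indices of all Bcl I sites in `seq`."""
    return [i for i in range(len(seq) - len(BCL_I) + 1)
            if seq[i:i + len(BCL_I)] == BCL_I]


def cuts_within_stop_codon(orf: str, site: int) -> bool:
    """True if the site starts at an in-frame TGA codon of `orf`."""
    return site % 3 == 0 and orf[site:site + 3] == "TGA"


# Hypothetical reading frame: ATG GCT AAA TGA followed by TCAGG.
orf = "ATGGCTAAATGATCAGG"
sites = bcl_i_sites(orf)
print(sites, DAM_TARGET in BCL_I, cuts_within_stop_codon(orf, sites[0]))
```

Because the methylase target GATC lies within the Bcl I site, DNA from a methylase-proficient host would be modified at this position, which is consistent with the use of a deficient strain described above.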

In the nomenclature plasmids containing fragment A
and the M13 transcription termination signal
downstream of a newly inserted structural gene are
indicated by replacing the addendum -0x with -1x.
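The plasmid nomenclature introduced in sections 2, 4 and 8 can be summarized in a small helper. The function name `plasmid_name` and the wording of the descriptions are assumptions made for this illustration; the codes themselves follow the rules stated in the text.

```python
PROMOTER_VARIANTS = {
    1: "Eco RI (Dde I) promoter fragment in the correct orientation",
    2: "as -01, with a single Eco RI site adjacent to the ATG codon",
    3: "promoter completed with the synthetic fragment containing Sac I",
    4: "as -03, with the Sac I site removed",
}


def plasmid_name(base: str, variant: int, terminator: bool = False) -> str:
    """Compose a plasmid code such as 'pUR 528-03' or 'pUR 1521-12'.

    `terminator` marks constructs carrying fragment A and the M13
    transcription termination signal (addendum -1x instead of -0x).
    """
    if variant not in PROMOTER_VARIANTS:
        raise ValueError(f"unknown promoter variant: {variant}")
    first_digit = 1 if terminator else 0
    return f"{base}-{first_digit}{variant}"


print(plasmid_name("pUR 528", 3))
print(plasmid_name("pUR 1521", 2, terminator=True))
```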

9. Introduction of the 2 micron DNA replication origin
and the yeast leu 2 gene in plasmids encoding
thaumatin-precursors and chymosin-precursors (Fig.
13, Fig. 14).

The E. coli-yeast shuttle vector pMP81 [Fig. 13;
C.P. Hollenberg, Current Topics in Microbiol. and
Immunol. 96, 119-144 (1982)] consists of plasmid
pCR1 [C. Covey et al., MGG 145, 155-158 (1976)]
and a double Eco RI fragment of pJDB 219 [J.D.
Beggs, Nature 275, 104-109 (1978)] carrying both
the leu 2 gene and the yeast 2 micron DNA
replication origin. The latter two functions can
be excised from pMP81 by a digestion with Hind III
and Sal I. The resulting 4.4 kb long restriction
fragment was introduced into the various pBR322
derivatives containing genes encoding thaumatin-
precursors in combination with the various forms of
the GAPDH promoter region of S. cerevisiae. A
similar procedure was used to introduce the 4.4 kb
long restriction fragment into the various pBR322
derivatives containing genes encoding chymosin-
precursors in combination with the GAPDH promoter
region of S. cerevisiae.

The introduction of the 2 micron replication origin
and leu 2 gene containing Hind III-Sal I fragment
into the various plasmids was accomplished by
cleavage of the E. coli plasmids with Hind III and
Sal I (cf. Fig. 14) and a subsequent treatment of
the resulting fragments with phosphatase. After
separation of the fragments by electrophoresis
through a 1% agarose gel, the largest fragment was
isolated, mixed with the purified Hind III-Sal I
fragment obtained from pMP81, ligated with T4 DNA
ligase and transformed to CaCl2-treated E. coli
cells.

In the pseudochymosin encoding plasmid containing
termination fragment A (pUR 1521-12; cf. Fig. 12),
the 4.4 kb long Hind III-Sal I fragment was
inserted as well. For this purpose plasmid pUR
1521-12 was digested with both Hind III and Sal I
(to remove the M13 transcription termination
region) after which the 2 micron origin and leu 2
gene containing fragment from pMP81 could be
inserted. Unexpectedly, this plasmid construction
(pURY 1521-12) proved to be stable in E. coli.

From some of the ampicillin-resistant transformants
plasmid DNA was isolated and subjected to
restriction enzyme analysis. Plasmids containing the
correct insert were purified by CsCl-ethidium bromide
density gradient centrifugation and used to
transform S. cerevisiae AH22 [A. Hinnen et al.,
Proc. Natl. Acad. Sci. USA 75, 1929-1933 (1978)]
according to the procedure of J.D. Beggs, Nature
275, 104-109 (1978). This resulted in plasmids
indicated by the letter code pURY but having the
same figure codes (cf. Fig. 14 in which the
conversion of plasmid pUR 528-03 into pURY 528-03 is
indicated). Similarly, plasmids pUR 522-02, pUR
523-02, pUR 1524-02, pUR 1523-02 and pUR 1521-02
were converted into their corresponding pURY
plasmids.

With the availability of yeast transformants
containing the newly constructed plasmids, the effects
of plasmid variation on gene expression could be
monitored. For this purpose an enzyme-linked
immunosorbent assay [ELISA; A. Voller et al., Bull.
World Health Organ. 53, 55-65 (1976)] for
thaumatin was developed and used to quantitate the
amounts of thaumatin-like protein present in yeast
extracts. The results obtained in these experiments
show that upon the introduction of the -02 or -03
type GAPDH promoter in pURY 528 (cf. 2, 4),
thaumatin synthesis is increased by more than two
orders of magnitude (more than 100 times). Upon the
introduction of the 280 nucleotides long promoter
fragment described by J.P. Holland and M.J. Holland
[J. Biol. Chem. 255, 2596-2605 (1980)], however,
thaumatin synthesis increased by only one order of
magnitude (about 10 times), thereby demonstrating
the important role played by upstream sequences of
the GAPDH promoter in its functioning, which is
contrary to the conclusions of Holland and Holland
[cf. J. Biol. Chem. 254, 9839-9845 (1979)].
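The comparison of the two promoter fragments can be expressed numerically. The sketch below uses only the approximate fold increases stated in the text (a lower bound of 100 for the -02 or -03 type promoter and about 10 for the 280 nucleotides long fragment); the variable and function names are assumptions made for this illustration.

```python
import math


def orders_of_magnitude(fold_increase: float) -> float:
    """Express a fold increase as a power of ten."""
    return math.log10(fold_increase)


long_promoter_fold = 100.0   # "more than 100 times": treated as a lower bound
short_fragment_fold = 10.0   # "about 10 times"

relative_effect = long_promoter_fold / short_fragment_fold
print(orders_of_magnitude(long_promoter_fold),
      orders_of_magnitude(short_fragment_fold),
      relative_effect)
```

On these rounded figures the longer promoter fragment is at least roughly ten times as effective as the 280 nucleotides long fragment.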

In another experiment the expression of
preprothaumatin and prothaumatin encoding genes was
compared. As expression of the prothaumatin encoding
gene turned out to be almost negligible, this
result clearly demonstrated the enormous impact
which signal sequences can have on gene expression.
The importance of the processing step was further
substantiated by the results shown in Fig. 20. The
latter experiment demonstrates that yeast cells
harbouring plasmids encoding preprothaumatin are
able to produce a thaumatin-like protein with a
molecular weight which is practically the same as
the molecular weight of thaumatin II molecules
isolated from the plant.

This suggests that yeast is able to correctly pro- 
cess the plant signal sequence. Recently obtained 
data on the amino-terminal amino acid sequence of 
yeast-synthesized thaumatin have provided definite 
evidence for this notion. 

In conclusion, the results obtained strongly
suggest that the signal sequence plays an important
role in either stimulating protein synthesis or
increasing protein stability in yeasts. It is not yet
certain whether prothaumatin or thaumatin is produced
by yeasts in view of the smaller difference in
molecular weight (about 3%) between these proteins,
whereas the difference with preprothaumatin (about
10%) is easily detectable as was demonstrated in
Fig. 20.

10. The introduction of GAPDH transcription
termination/polyadenylation regions into pURY plasmids.

One of the possibilities to introduce GAPDH
transcription termination/polyadenylation regions into
pURY plasmids is to use the unique Hind III
restriction site of these plasmids for insertion of
various parts of the Afl II fragment depicted in
Fig. 12. For example, the insertion of
transcription termination fragment A (cf. Fig. 12) can
be carried out as follows. Cleavage of plasmid pURY
528-03 with Hind III and a subsequent Mung bean
nuclease digestion (cf. 2) will yield a linearized
and blunt-ended plasmid molecule (cf. Fig. 14).
Incubation of this DNA with T4 DNA ligase and a
suitable Bam HI linker will equip the fragment with Bam
HI sites (cf. 8). Upon transformation of the Bam
HI-cleaved and religated product into E. coli, pURY
528-03 plasmids in which the Hind III site is
replaced by a Bam HI site can be recovered. Into this
newly created Bam HI site transcription termination
fragment A can be inserted. Digestion of the latter
construction with Hpa I will indicate whether or
not fragment A is inserted in the correct
orientation, since in the 2 micron DNA sequence an
additional Hpa I site is available.

30 • Using the same app;roach but different linker mole- 

cules # different terminator fragments can be intro- 
duced at various sites in the plasmid molecule. 
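The orientation check by Hpa I digestion described above can be sketched as follows. The idea is that one Hpa I site sits at a fixed position in the vector while a second one sits asymmetrically inside the inserted fragment, so the two orientations give different fragment sizes. All coordinates below (plasmid length, site positions, insert length) are hypothetical values chosen for illustration and are not taken from the plasmids described here.

```python
def circular_fragments(plasmid_len, cut_sites):
    """Fragment lengths obtained by cutting a circular molecule at cut_sites."""
    sites = sorted({s % plasmid_len for s in cut_sites})
    if not sites:
        return [plasmid_len]
    fragments = []
    for i, site in enumerate(sites):
        nxt = sites[(i + 1) % len(sites)]
        # A single site gives a difference of 0, i.e. one full-length fragment.
        fragments.append((nxt - site) % plasmid_len or plasmid_len)
    return sorted(fragments)


def hpa_pattern(plasmid_len, vector_site, insert_start, insert_len,
                site_in_insert, forward):
    """Expected Hpa I fragments for an insert in forward or reverse orientation."""
    offset = site_in_insert if forward else insert_len - site_in_insert
    return circular_fragments(plasmid_len, [vector_site, insert_start + offset])


# Hypothetical construct: 10 000 bp in total, vector Hpa I site at 2000,
# an 800 bp insert starting at 6000 with its own Hpa I site 150 bp from one end.
params = dict(plasmid_len=10_000, vector_site=2_000, insert_start=6_000,
              insert_len=800, site_in_insert=150)

forward = hpa_pattern(forward=True, **params)
reverse = hpa_pattern(forward=False, **params)
print("forward:", forward)
print("reverse:", reverse)
```

The check only works when the site in the insert is off-centre; a site exactly in the middle of the insert would give the same pattern in both orientations.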

11. Construction of an E. coli-yeast shuttle vector
widely applicable for the expression of foreign
genes in yeast (Fig. 15).

Derivatives of the E. coli-yeast shuttle vectors
described under 5 and 9, useful as generally applicable
yeast expression vectors, can be prepared. In these
derivatives the GAPDH promoter/regulation region
including a chemically synthesized DNA fragment can be
used for RNA initiation, whereas transcription
termination can be effected by the transcription
termination/polyadenylation region of the "Able" operon
of the 2 micron DNA. Insertion of foreign genes in these
expression vectors can be accomplished by homopolymer
(poly dA) tailing of the Sac I site in the promoter and
the Hind III site present in the "Able" operon in
combination with poly dT tailing of the nucleotide
sequence to be inserted.

For the preparation of the vector molecule, plasmid
pURY 528-03 can be cleaved with Hind III and then
incubated with Klenow DNA polymerase and the four
dNTPs to generate a blunt-ended, linearized DNA
molecule (Fig. 15). Subsequently this DNA molecule
can be cleaved with Sac I, and of the two resulting
DNA fragments the larger can be isolated from a
0.7% agarose gel and incubated with terminal
transferase and dATP under conditions described by
G. Deng and R. Wu [Nucleic Acids Res. 9, 4173-4188
(1981)]. The time of incubation has to be such that
the tail added to the 3' end generated by cleavage
with Sac I has a length of about 20 dA residues.

The introduction of a foreign gene into the poly
dA-tailed expression vector can be carried out by
incubating the DNA containing this gene with a set
of restriction enzymes such that the desired gene
can be cleaved as close as possible upstream of the
translation initiation codon and downstream of the
translation termination codon of this gene. Since
promoter regions can be preceded by transcription
termination signals, it is important that the original
promoter is not contained within the resulting DNA
fragment. After purification, the trimmed DNA fragment
can be equipped with poly dT tails.

If the gene to be inserted is obtained by reverse
transcription of an mRNA molecule, the poly dT
tails can be directly added to the S1 nuclease-treated
double-stranded DNA molecule. In all cases the time of
incubation with terminal transferase must be chosen
such that poly dT tails with a length of about 20
nucleotides are generated. The DNA to be inserted and
the poly dA-tailed vector molecule can then be
hybridized by incubation at 65°C for 10 minutes,
followed by cooling down the hybridization mixture
slowly to room temperature. The mixture can
subsequently be transformed into CaCl2-treated E. coli
cells. Plasmid DNA isolated from ampicillin-resistant
transformants can then be used to transform yeast cells.
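The trimming and tailing steps above can be illustrated with a minimal sketch. It locates the coding region of a fragment (first ATG up to the first in-frame stop codon), reports how much flanking DNA lies outside it, and checks that a poly dT tail of about 20 nucleotides is complementary to a poly dA tail of the same length. The example fragment and the helper names are hypothetical; real trimming depends on where restriction sites actually fall.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def coding_bounds(seq):
    """Return (start, end) of the first ATG-initiated reading frame.

    `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    raise ValueError("no in-frame stop codon found")


def reverse_complement(seq):
    """Reverse complement of a DNA strand written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def add_3prime_tail(strand, base, n=20):
    """Append a homopolymer tail of n residues to the 3' end of a strand."""
    return strand + base * n


# Hypothetical fragment: 6 flanking bases, a tiny coding region, 6 flanking bases.
fragment = "GGATCC" + "ATGGCTTAA" + "CTGCAG"
start, end = coding_bounds(fragment)
upstream, downstream = start, len(fragment) - end

vector_tail = add_3prime_tail("", "A")   # poly dA on the vector
insert_tail = add_3prime_tail("", "T")   # poly dT on the insert
tails_anneal = reverse_complement(vector_tail) == insert_tail

print(upstream, downstream, tails_anneal)
```

Keeping the upstream flank short matters for the reason given in the text: extra upstream DNA may carry the original promoter or termination signals.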

12. Expression in K. lactis of the preprothaumatin
encoding gene under control of the promoter/regulation
region of the glyceraldehyde-3-phosphate
dehydrogenase encoding gene of S. cerevisiae
(Fig. 16, Fig. 17).

The E. coli-yeast shuttle vector pEK 2-7 consists
of plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)] containing the 1.2 kb KARS-2
fragment. Owing to the presence of the yeast trp 1
gene, plasmid pEK 2-7 can be maintained in K.
lactis SD 11 (lac 4, trp 1; cf. pages 18 and 19 of
European Patent Application No. 0 096 910 A1).

To demonstrate the functionality of the promoter/
regulation region of the GAPDH encoding gene in
K. lactis, plasmid pUR 528-03 (Fig. 8) has been
equipped with both KARS-2 and the trp 1 gene. The
latter two functions were excised from pEK 2-7 by a
digestion with Bgl II followed by the isolation
from a 0.7% agarose gel of the smallest fragment
generated. This purified fragment was then inserted
in the dephosphorylated Bgl II cleavage site of pUR
528-03 by incubation with T4 DNA ligase.
Transformation of the ligation mix into CaCl2-treated
E. coli cells yielded plasmid pURK 528-03 (Fig. 16).
Transformants generated by the introduction of the
latter plasmid into K. lactis SD 11 cells by the
procedure described in European Patent Application
No. 0 096 910 A1 could be shown to synthesize
preprothaumatin (Fig. 17).

By techniques similar to those mentioned above,
plasmid pURY 528-03 was also equipped with KARS-2
and the yeast trp 1 gene and introduced into K.
lactis SD 11 (Fig. 16). Using the same detection
procedure, K. lactis cells carrying pURK 528-33
could also be shown to synthesize preprothaumatin
(Fig. 17). Preprothaumatin production in cells
containing pURK 528-33 was, however, slightly
higher than in cells containing pURK 528-03. Since
similar observations have been made by C. Gerbaud
et al. [Gene 5, 233-253 (1979)] in the expression
of the yeast ura 3 gene upon insertion of this gene
within the coding region "Able" of the 2 micron DNA,
it is very likely that the enhanced expression of
preprothaumatin by pURK 528-33 is due to efficient
transcription termination events in the transcription
termination/polyadenylation region of the "Able"
operon. This observation indicates that the presence
of an efficient transcription termination/
polyadenylation region downstream of a structural
gene transcribed by the GAPDH promoter/regulation
region is an important factor in optimizing gene
expression.

13. Integration of structural genes under the control
of the GAPDH promoter/regulation region into the
yeast chromosome (Fig. 18, Fig. 19).

Integration of DNA sequences encoding either a
heterologous or homologous structural gene under
transcriptional control of the GAPDH promoter/
regulation region and, optionally, the GAPDH
termination/polyadenylation region into the yeast
genome can be achieved on the basis of techniques
described by R.J. Rothstein [in Methods in Enzymology
101, 202-211 (1983)] and K. Struhl [Nature
305, 391-397 (1983)]. The criteria for applying these
techniques are the availability of (i) suitable
marker genes, (ii) cloned DNA sequences homologous
with DNA sequences present on the yeast genome and
(iii) an intact, homologous or heterologous,
protein-encoding DNA sequence.

Having the marker genes (e.g. leu 2, trp 1, his 3)
and the protein-encoding DNA sequence (e.g. the
sequences encoding thaumatin precursors or
chymosin precursors) available, the latter
sequences can be integrated into the yeast genome
by homologous recombination events either between
GAPDH promoter and terminator sequences (cf. Fig.
19) or between the marker genes. In the latter
approach, which offers a variety of integration
sites, the foreign protein-encoding DNA sequence is
integrated within the "wild type" marker gene,
thereby destroying the function of this gene.



14. Chemical synthesis of structural genes.
Construction of synthetic chimeric genes.

Expression experiments in S. cerevisiae with the
various structural genes of preprothaumatin and
preprochymosin (located on the 2 micron DNA vector
and under control of the GAPDH promoter/regulation
region and the GAPDH termination region) resulted in
an expression yield that was considered not yet
economically attractive for the fermentative
production of these proteins. The expression of
preprothaumatin and preprochymosin genes located on
the KARS vector was not economically feasible either.

Another problem observed during expression
experiments was that, whereas preprothaumatin was
processed correctly in both S. cerevisiae and
K. lactis into thaumatin (Fig. 20) (it is not clear
whether the 6 amino acids on the C-terminus are also
removed during processing), such correct processing
could not be detected with preprochymosin.

Therefore it is proposed to synthesize the
preprochymosin and preprothaumatin genes using the
preferred codons of S. cerevisiae [J.P. Holland and
M.J. Holland, J. Biol. Chem. 255, 2596-2605 (1980)]
in order to increase the expression yield. The
syntheses of these genes can be carried out according
to the methods described under 4 and 6. The nucleotide
sequences of chymosin and thaumatin in a preferred
codon usage are given in Fig. 21 and Fig. 22.
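The re-coding step can be illustrated with a minimal back-translation sketch based on the preferred codons listed in claim 8. Where the claim gives two codons for an amino acid, the choice made below (for example GCT for alanine, TCC for serine, ATT for isoleucine) is an arbitrary assumption; it happens to reproduce the invertase leader of Fig. 23, which is used here as a check. The function name is hypothetical.

```python
# One preferred codon per amino acid, following claim 8.
# Assumption: where claim 8 lists two codons, one is picked arbitrarily.
PREFERRED_CODON = {
    "A": "GCT", "L": "TTG", "R": "AGA", "K": "AAG", "N": "AAC",
    "M": "ATG", "D": "GAC", "F": "TTC", "C": "TGT", "P": "CCA",
    "Q": "CAA", "S": "TCC", "E": "GAA", "T": "ACT", "G": "GGT",
    "W": "TGG", "H": "CAC", "Y": "TAC", "I": "ATT", "V": "GTT",
}


def back_translate(protein):
    """Encode a protein (one-letter code) with the preferred yeast codons."""
    return "".join(PREFERRED_CODON[residue] for residue in protein.upper())


# Signal polypeptide of S. cerevisiae invertase, as listed in claim 6(a).
invertase_signal = "MLLQAFLFLLAGFAAKISA"
dna = back_translate(invertase_signal)

# Leader sequence of invertase as printed in Fig. 23 (spaces removed).
fig23_invertase = (
    "ATGTTGTTGC AAGCTTTCTT GTTCTTGTTG "
    "GCTGGTTTCG CTGCTAAGAT TTCCGCT"
).replace(" ", "")

print(dna)
print(dna == fig23_invertase)
```

A fixed one-codon-per-residue table is the simplest possible design; it ignores considerations such as avoiding or introducing restriction sites, which a real gene design would also have to weigh.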

As described under 4 and 6, the synthesis can be
carried out by making small (up to 30 nucleotides)
single-stranded DNA fragments with partially
overlapping sequences. This method also makes it
possible to obtain the various maturation forms of
both genes simultaneously (cf. Fig. 21-22).
Moreover, the approach adopted is suitable for
making changes in parts of the sequences. One can
use this possibility to replace the leader sequence
of preprochymosin (nucleotides -174 to -127,
Fig. 21) by various other leader sequences. Two
typical yeast leader sequences are that of acid
phosphatase [K. Arima et al., Nucleic Acids Res.
11, 1657-1672 (1983)] and that of invertase [R.
Taussig and M. Carlson, Nucleic Acids Res. 11,
1943-1954 (1983)]. Examples of nucleotide sequences
encoding both leader sequences with codons preferred
by yeasts are given in Fig. 23, whereas Fig. 24
gives schematically two designed chimeric acid
phosphatase/prochymosin and invertase/prochymosin
genes. Because preprothaumatin was processed
correctly by the yeast cells, we also designed a
chimeric gene of prochymosin and the leader
sequence of the preprothaumatin gene (Fig. 24).
Based on physico-chemical considerations about
the nature of the yeast leader sequences and the
interaction of these leader sequences with the
signal-recognition protein [P. Walter and G.
Blobel, Proc. Natl. Acad. Sci. USA 77, 7112-7116
(1980)], we also designed two consensus leader
sequences (Fig. 25) which can be used to make
chimeric genes of these consensus sequences with
the prochymosin gene (Fig. 24).
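The construction of a chimeric leader/prochymosin gene can be sketched as a simple in-frame concatenation followed by a translation check. The sketch below joins the acid phosphatase leader of Fig. 23 to the first 30 nucleotides of the pro part of Fig. 21 and translates the result with the standard genetic code. Only the two sequence strings come from the figures; the function names and the decision to use just the first ten pro codons are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Strip the block spacing used in the figures."""
    return seq.replace(" ", "").upper()


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acid code."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Acid phosphatase leader (Fig. 23) and start of the pro part (Fig. 21).
acid_phosphatase_leader = clean(
    "ATGTTCAAGT CCGTTGTTTA CTCCATTTTG GCTGCTTCCT TGGCTAACGC T"
)
pro_start = clean("GCTGAAATTA CTAGAATTCC ATTGTACAAC")

chimera = acid_phosphatase_leader + pro_start
leader_peptide = translate(acid_phosphatase_leader)
chimera_peptide = translate(chimera)

print(leader_peptide)
print(chimera_peptide)
```

Because the leader is a whole number of codons long, the pro part stays in frame after the junction; the same check applies to the invertase, thaumatin and consensus leaders.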
Legends to the figures

Fig. 1 Restriction endonuclease cleavage map of a region of plasmid
pFl 1-33 containing a yeast glyceraldehyde-3-phosphate
dehydrogenase operon.

Fig. 2 Nucleotide sequence of the promoter/regulation region of a
glyceraldehyde-3-phosphate dehydrogenase operon cloned in
pFl 1-33. TATA and TATAAA sequences are indicated by solid
underlining. Presumptive transcription termination signals are
underlined with dots. Nucleotide sequences between brackets
indicate inverted repeats. The nucleotide sequence encoding the
38 amino acids long peptide is enclosed in a box.

Fig. 3 Nucleotide sequence of the transcription termination/ 
polyadenylation region of a glyceraldehyde-3-phosphate 
dehydrogenase operon cloned in pFl 1-33. AATAA sequences are 
indicated by solid underlining. Presumptive transcription 
termination signals are underlined with dots. 

Fig. 4 Schematic representation of the insertion of the Eco RI (Dde I) 
GAPDH promoter/regulation fragment in the preprothaumatin 
encoding plasmid and removal of an Eco RI cleavage site from 
the resulting plasmid. 

Fig. 5 Schematic representation of the construction of (pre)(pro)- 

(pseudo)chymosin encoding plasmids containing the Eco RI (Dde I) 
GAPDH promoter/ regulation fragment. 
Fig. 6 Schematic representation of the structure of the GAPDH
promoter/regulation region including potential stem and loop
structures. Presumptive transcription termination signals are
indicated by slashes. The position of the coding sequence for
the 38 amino acids long peptide on the fragment is shown by ATG
and TAA codons.

Fig. 7a Representation of the various steps involved in the preparation
of the synthetic DNA fragment used to reconstitute the
original GAPDH promoter/regulation region upstream of
preprothaumatin encoding nucleotide sequences.

Fig. 7b Representation of the synthetic DNA fragment used to
reconstitute the original GAPDH promoter/regulation region
upstream of (pre)(pro)(pseudo)chymosin encoding nucleotide
sequences.

Fig. 8 Schematic representation of the introduction of the synthetic
DNA fragment in preprothaumatin encoding plasmids.



Fig. 9 Schematic representation of the removal of the Sac I site from 
the reconstituted GAPDH promoter/ regulation region. 

Fig. 10 Schematic representation of the introduction of the synthetic 
DNA fragment in (pre)(pro)(pseudo)chymosin encoding plasmids. 

Fig. 11 General scheme for synthesis of DNA fragments on a polystyrene 
support. 
Fig. 12 Schematic representation of an Afl II-Bam HI transcription
termination/polyadenylation fragment obtained from pFl 1-33
and its insertion, in combination with the M13 central
transcription termination signal, downstream of the nucleotide
sequence encoding pseudochymosin.

Fig. 13 Schematic representation of plasmid pMPBl.

Fig. 14 Schematic representation of the introduction of the 2 micron
DNA origin of replication and the leu 2 gene obtained from
pMFBl by digestion with Hind III and Sal I into preprothaumatin
encoding plasmids.

Fig. 15 Schematic representation of an E. coli-yeast shuttle vector
widely applicable for the expression of foreign genes in
yeast.

Fig. 16 Schematic representation of the introduction of the KARS-2 and
trp 1 gene obtained from pEK 2-7 by digestion with Bgl II into
preprothaumatin encoding plasmids.

Fig. 17 Fluorogram of (35S)-labelled thaumatin-like proteins
synthesized by K. lactis SD 11 cells containing plasmid pEK 2-7
(lane a), plasmid pURK 528-03 (lane b) and plasmid pURK 528-33
(lane c). Yeast transformants were grown for 3 hours on a
minimal medium containing 35S-cysteine. Cells were collected
by centrifugation, resuspended in 1 ml 2.0 mol/l sorbitol,
0.025 mol/l NaPO4 pH 7.5, 1 mmol/l EDTA, 1 mmol/l MgCl2,
2.5% β-mercaptoethanol, 1 mg/ml zymolyase 60.000 and incubated
for 30 minutes at 30°C. Spheroplasts were then centrifuged
and lysed by the addition of 110 µl H2O, 4 µl 100 mmol/l
PMSF, 8 µl 250 mmol/l EDTA, 40 µl 9% NaCl and 80 µl 5x PBSTDS
(50 mmol/l NaPO4 pH 7.2, 5% Triton X 100, 2.5% deoxycholate,
2.5% SDS). Immunoprecipitation of thaumatin-like proteins and
analysis of precipitated proteins was carried out as described
by L. Edens et al., Gene 18, 1-12 (1982).
Fig. 18 Schematic representation of plasmid pEK 2-7.

Fig. 19 Schematic representation of a plasmid which can be used for the
insertion of a pseudochymosin-encoding nucleotide sequence
under transcriptional control of the GAPDH promoter/regulation
and the GAPDH termination/polyadenylation region into the yeast
chromosome.

Fig. 20 Fluorogram of (35S)-cysteine labelled thaumatin-like protein
synthesized by S. cerevisiae AH 22 cells containing plasmid pURY
528K^3 (lane c) or synthesized in vitro in a wheat germ protein
synthesizing system under the direction of mRNA purified from
arils of Thaumatococcus fruits (lane b). Lane a shows
radioactive marker proteins (obtained from Amersham) and lane d
shows the position of thaumatin II isolated from the arils of
Thaumatococcus fruits. Lysis of yeast cells was carried out
as described in the legend of Fig. 17. Wheat germ translation
of mRNA and immunoprecipitation procedures were carried out
as described by L. Edens et al., Gene 18, 1-12 (1982).

Fig. 21 Nucleotide sequence of the gene encoding (pre)(pro)(pseudo)
chymosin in codon usage preferred by S. cerevisiae.

Fig. 22 Nucleotide sequence of the gene encoding (pre)(pro)thaumatin
in codon usage preferred by S. cerevisiae.

Fig. 23 Nucleotide sequences of the genes encoding the leader sequences
of acid phosphatase and invertase in codon usage preferred
by S. cerevisiae.

Fig. 24 Schematic representation of the construction of the gene
encoding prochymosin provided with the leader sequences of
the invertase, acid phosphatase or thaumatin encoding genes,
or with the two designed consensus leader sequences.

Fig. 25 Nucleotide sequences of the two designed consensus leader
sequences.
CLAIMS

1. A DNA sequence capable of initiating transcription
by yeast RNA polymerase II which includes at
least part of the regulon region of one of the GAPDH
genes of S. cerevisiae, characterized in that it
comprises a DNA sequence essentially as given in Fig. 2,
and wherein the regulon region is optionally modified
to include at least one restriction enzyme cleavage
site, to facilitate manipulation of the nucleotide
sequence region for protein synthesis initiation.

2. A DNA sequence capable of both termination of
the transcription by yeast RNA polymerase II and
effecting polyadenylation of the mRNA, which includes at
least part of the termination/polyadenylation region
belonging to one of the GAPDH genes of S. cerevisiae,
characterized in that it comprises a DNA sequence
essentially as given in Fig. 3.

3. DNA sequence, which can be inserted into a
recombinant DNA plasmid or into a yeast chromosome,
comprising:

(a) a DNA sequence according to claim 1, and

(b) one or more structural genes different from the
GAPDH genes of S. cerevisiae, and at least two of
features (c)-(f),

(c) one or more specific DNA sequences capable of
terminating the transcription by yeast RNA polymerase
II and effecting polyadenylation of the mRNA,
and/or

(d) one or more selection markers, and/or

(e) either one or more nucleotide sequences allowing a
stable insertion in a chromosome of yeasts or one
or more DNA sequences which regulate DNA replication
in yeasts belonging to the genus Saccharomyces
or to the genus Kluyveromyces, and/or

(f) a DNA sequence encoding a signal polypeptide of not
more than 30 amino acid residues assisting the
translocation of proteins, which DNA sequence is
situated upstream of and in the same reading frame
as the structural gene.

4. DNA sequence according to claim 3, characterized
in that the structural gene of (b) encodes thaumatin
or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins.

5. DNA sequence according to claim 3, characterized
in that the specific DNA sequence of (c) is a
DNA sequence essentially as given in Fig. 3.

6. DNA sequence according to claim 3, characterized
in that the DNA sequence of (f) encoding a signal
polypeptide is selected from the group consisting
of DNA sequences encoding

(a) the signal polypeptide translocating S. cerevisiae
invertase, namely Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.-
Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;

(b) the signal polypeptide translocating S. cerevisiae
acid phosphatase, namely Met.Phe.Lys.Ser.Val.Val.-
Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;

(c) the signal polypeptide of unmatured forms of
thaumatin-like proteins, namely Met.Ala.Ala.Thr.-
Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.-
Leu.Thr.Leu.Ser.Arg.Ala;

(d) the signal polypeptide of unmatured forms of
chymosin-like proteins, namely Met.Arg.Cys.Leu.Val.-
Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and

(e) two consensus signal polypeptides, namely
Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.-
Ile.Val.Leu.Ile.Val.Asn.Ala and
Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.-
Leu.Ala.Phe.Ile.Ala.Asn.Ala.



7. DNA sequence according to claim 3, characterized
in that either the codons of the structural
gene of (b) or the codons of the signal polypeptide-
encoding DNA sequence of (f), or both, are modified
into codons preferred by yeasts.

8. DNA sequence according to claim 7, characterized
in that as codons preferred by yeasts the
following codons are used:



GCC or GCT    alanine
TTG           leucine
AGA           arginine
AAG           lysine
AAC           asparagine
ATG           methionine
GAC or GAT    aspartic acid
TTC           phenylalanine
TGT           cysteine
CCA           proline
CAA           glutamine
TCC or TCT    serine
GAA           glutamic acid
ACC or ACT    threonine
GGT           glycine
TGG           tryptophan
CAC           histidine
TAC           tyrosine
ATC or ATT    isoleucine
GTC or GTT    valine.



9. Process for preparing yeasts containing rDNA
sequences, characterized in that DNA sequences as
claimed in claim 3 are introduced into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

10. Yeast containing an rDNA sequence as claimed
in claim 3, either in the form of a plasmid or
incorporated in the yeast genome.

11. Process for preparing a protein by cultivation
of a yeast containing rDNA sequences, characterized
in that a yeast as claimed in claim 10 is used
to produce a protein, whereby the rDNA sequence
contains a structural gene encoding that protein or a
precursor thereof which precursor during processing can
form the relevant protein.

12. Protein, produced by a process according to
claim 11.
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Fig.3. 
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67 


77 




CATGACATCT 


CATATACAC6 


TTTATAAAAC 


127 
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147 


CTCAG6GATT 


CGATTTTTTT 


GGAAGTTTTT 


187 


197 


207 


TGGGAACAAT 
247 


TATACAATCG 


AAAGATATAT 


257 


267 


TTAICCTATA 


6TAACATAAC 


CTGAAGTATA 


307 


317 


327 


ATGAGAACTC 


TGTGAATAAT 


TAGGCCACT6 


367 


377 


387 


TATCTTCGAT 


AAAGCACTTA 


GTATCACACT 




447 


G6TGATTTCC 


AAGTATT6TT 


TCCAA6CATC 


487 


497 


507 


CGTTTTCATC 


GCATATCTGT 


CCATTATTTC 
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557 


567 
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CTCCTAGCAG 


TTAACATTTC 


607 


617 


627 


TCAACTCTTC 


CTTGCTTACC 


GACGTACCCG 


667 


677 


687 
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TACAATGAAT 


AAATTGTTCG 


727 


737 


747 


CAATGCAAC6 


ATCAACATCA 


ACGCTGTTAT 


787 


797 


807 


TATCTAAAGA 


AATTTCTCTC 


CATTTCAAAG 


847 


857 
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AGGAATGTTC 


AGCCATATCA 


GTGTCATGAT 


907 
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CTTCTGTATT 
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1087 


1097 


1107 


AGTAGGCCAT 


AGTGGGTCCA 


CAATACCTGT 
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37 


47 


57 


ATTTAGTTTT 
9f 


TATTAT7AAT 


AATTCATGCT 


107 


117 


TTAAATAGAT 


TGAAAATGTA 


TTAAAGATTC 


157 


167 


177 


GTTTTTTTTT 
217 


CCTTGAGATG 
227 


CTGTAGTATT 

••••••••• 9U 

237 


GCTTACATTC 


GACCGTTTTA 


GGCGTGATCA 


277 


287 


297 


ACTGACACTA 


CTATCATCAA 


TACTTGTCAC 


337 


347 


357 


AAATTTGATG 


CCTGAAGGAC 


C6GCATCACG 


397 


407 


417 


AATTGGCTTT 


TCGCCGCATA 


TGGTGTTTCC 




467 


477 


GTACCTTTCA 


CCATTTGGAG 


TATCACTTAG 


517 
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707 
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757 


767 


77 7 


GA6AAACCAT 


CATGGGAATT 


ACCTTCACCG 


817 


827 


837 


TTTCCACCAA 


CATGGGGAGC 


TGCATCTCTA 


877 


887 


897 


CCATTGGCTT 


AAACAGCTTC 


MB Mi HI ^% M% m fM 

TTTCCGTTCT 


937 


947 


a c ? 
9 5 7 


ACAAGTCTGT 


ATCCACTTTC 


AGATTACCCA 


997 


1007 


1017 


CTAAAATTTG 


GTTTTTGAAA 


TAGATGTb AT 


1057 


1067 


1077 


CATTACGGTT 


CAAATCCAAC 


GATGCGTACG 


1117 


1127 


1137 


AACCGGCAT6 


AGGACATATG 


ATAATTCTGG 


1177 


1187 


1197 


TGATCAAGTA 


TGTATGCGGT 


TGTTGAGATA 


1237 


1247 


1257 


CTGCATGTGT 


TGCTGGACGT 


ATTGACATGT 


1297 


1307 


1317 


CACCTAAGCC 


ACCGACTAGG 


ACCACTTCAC 
Fig. 4.

[Flow scheme; only the text labels of the drawing are recoverable]

pFl 1-33, contains GAPDH gene in 9 kb insert
  -> isolate 4.3 kb GAPDH promoter containing fragment
  -> Dde I
  -> isolate Pvu II site-containing promoter fragment
     (sites shown: Dde I, Bgl II, Pvu II, Dde I)
  -> Klenow Pol. I, dNTP's
  -> Eco RI linker
Fig. 7a.

Sac I

5' CCC.TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC 3'

3' TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'

Dde I

Sac I

Klenow DNA-polymerase
dNTP's

5' CCC.TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
3' GGG.AAT.CAA.AGT.TTA.ATT.TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'

Dde I

Sac I

5' TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
3' CAA.AGT.TTA.ATT.TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'

Sac I

DNA-polymerase, dNTP's
DNA ligase

5' TTA.GTT.TCA.AAT.TAA.AGC.ATC.ACA.CAA.ACA.AAC.AAA.ACA.AA 3'
3' CAA.AGT.TTA.ATT.TCG.TAG.TGT.GTT.TGT.TTG.TTT.TGT.TT 5'

Fig. 7b.

5' A.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
3' TA.GTC.TGT.TTG.TTT.GTT.TTG.TTT 5'
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Fig. 8. 
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Fig. 12. 
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Fig. 16. 
Fig. 21.

LEADER SEQUENCE PREPROCHYMOSIN IN S. CER. PREFERRED CODONS

ATGAGATGTT TCCTTGTTTT GTTGGCTGTT TTCGCTTTGT CCCAAGGT
-174 -165 -155 -145 -135 -127

DNA SEQUENCE PRO PART PRECEDING PSEUDOCHYMOSIN IN S. CER. PREFERRED CODONS

GCTGAAATTA CTAGAATTCC ATTGTACAAC GCTAAGTCCT TGACAAAGGC TTTGAAGGAA
-126 -116 -106 -96 -86 -76

CACGGTTTGT TGGAAGACTT C
-66 -56 -46

DNA SEQUENCE PSEUDO PART PRECEDING CHYMOSIN IN S. CER. PREFERRED CODONS

TTGCAAAACC AACAATACGG TATTTCCTCC AAGTACTCCG GTTTC
-45 -36 -26 -16 -6 -1

DNA SEQUENCE CHYMOSIN IN S. CER. PREFERRED CODONS

10 20 30 40 50 60 

GGTGAAGTTG CTTCCGTTCC ATTGACTAAC TACTTGGACT CCCAATACTT CGGTAACATT 

70 80 90 100 110 120 

TACTTGGGTA CTCCACCACA AGAATTCACT GTTTTGTTCG ACACTGCTTC CTCCCACTTC 
130 140 150 160 170 180 

TGGGTTCCAT CCATTTACTG TAAGTCCAAC GCTTGTAAGA ACCACCAAAG ATTCGACCCA 
190 200 210 220 230 240 

AGAAAGTGCT CCACTTTCCA AAACTTGGGT AAGCCATTGT CCATTCACTA CGGTACXGGT 
250 260 270 280 290 300 

TCCATCCAAG CTATTTTGGG TTACGACACT CTTACTGTTT CCAACATTGT TGACATTCAA 

310 320 330 340 350 360 

CAAACTCTTG CTTTGTCCAC TCAAGAACCA CGTGACGTTT TCACTTACGC TGAATTCGAC 

370 380 390 400 410 420 

CGTATTTTGG GTATGGCTTA CCCATCCTTG CCTTCCGAAT ACTCCATTCC AGTTTTCGAC 

430 440 450 460 470 480 

AACATCATGA ACAGACACTT CCTTGCTCAA GACTTGTTCT CCCTTTACAT GGACAGAAAC 

490 500 510 520 530 540

GGTCAAGAAT CCATGTTCAC TTTGCGTGCT ATTGACCCAT CCTACTACAC TGGTTCCTTG

550 560 570 580 590 600

CACTGCGTTC CACTTACTCT TCAACAATAC TGCCAATTCA CTGTTGACTC CGTTACTATT

610 620 630 640 650 660

TCCGGTGTTG TTGTTGCTTC TCAAGGTGGT TGTCAAGCTA TTTTGGACAC TGGTACTTCC

670 680 690 700 710 720

AAGTTCGTTG GTCCATCCTC CGACATTTTG AACATTCAAC AAGCTATTGG TGCTACTCAA

730 740 750 760 770 780

AACCAATACC GTCAATTCGA CATTGACTGT GACAACTTGT CCTACATGCC AACTGTTGTT

790 800 810 820 830 840

TTCGAAATTA ACGGTAACAT GTACCCATTG ACTCCATCCC CTTACACTTC CCAAGACCAA

850 860 870 880 890 900

GGTTTCTGTA CTTCCCGTTT CCAATCCGAA AACCACTCCC AAAAGTGGAT TTTGCCTGAC

910 920 930 940 950 960

GTTTTCATTA GACAATACTA CTCCGTTTTC CACACAGCTA ACAACTTCCT TGGTTTGCCT

969

AAGGCTATT
Fig. 22.

LEADER SEQUENCE OF PREPROTHAUMATIN IN S. CER. PREFERRED CODONS

ATCCCTCCTA CTACTTGTTT CTTCTTCTTG TTCCCATTCT TGTTGTTGTT GACTTTCTGC
-66 -57 -47 -37 -27 -17 -7

AGAGCT
-1

DNA SEQUENCE MATURE THAUMATIN IN S. CER. PREFERRED CODONS

10 20 30 40 50 60
GCTACTTTCC AAATTGTTAA CAGATGTTCC TACACTGTCT GGGCTGCTGC TTCCAAGGGT

70 80 90 100 110 120
CACCCTGCTT TGGACGCTGG TGGTAGACAA TTGAACTCCG GTGAATCCTG GACTATTAAC

130 140 150 160 170 180
GTTGAACCAG GTACTAAGGG TGGTAAGATT TGGGCTAGAA CTGACTCTTA CTTCGACGAC

190 200 210 220 230 240
TCCGGTAGAG CTATTTGTAC AACTGGTGAC TGTGGTGGTT TGTTGCAATG TAAGAGATTC

250 260 270 280 290 300
GGTAGACCAC CAACTACTTT GGCTGAATTC TCCTTGAACC AATACGGTAA GGACTACATT

310 320 330 340 350 360
GACATTTCCA ACATTAAGGG TTTCAACCTT CCAATGTACT TCTCCCCAAC TACTAGAGGT

370 380 390 400 410 420
TGTAGAGGTG TTAGATGTGC TGCTGACATT CTTGGTCAAT GTCCAGCTAA GTTGAAGGCT

430 440 450 460 470 480
CCAGGTGGTG GTTGTAACGA CGCTTGTACT GTTTTCCAAA CTTCCGAATA CTGTTGTACT

490 500 510 520 530 540
ACTGGTAAGT GTGGTCCAAC TGAATACTCC ACATTCTTCA AGAGATTGTG TCCAGACGCT

550 560 570 580 590 600
TTCTCCTACG TTTTGCACAA GCCAACTACT GTTACTTGTC CAGGTTCCTC CAACTACAGA

610 620
GTTACTTTCT GTCCAACXGC T

DNA SEQUENCE ACIDIC PEPTIDE OF PROTHAUMATIN IN S. CER. PREFERRED CODONS

622 632
TTGGAATTGG AAGACGAA
LEADER SEQUENCE ACID PHOSPHATASE IN YEAST PREFERRED CODONS

ATGTTCAAGT CCGTTGTTTA CTCCATTTTG GCTGCTTCCT TGGCTAACGC T
-51 -41 -31 -21 -11 -1

LEADER SEQUENCE INVERTASE IN YEAST PREFERRED CODONS

ATGTTGTTGC AAGCTTTCTT GTTCTTGTTG GCTGGTTTCG CTGCTAAGAT TTCCGCT
-57 -47 -37 -27 -17 -7 -1

Fig. 23.

CONSENSUS LEADER SEQUENCE (A-B) IN YEAST PREFERRED CODONS

ATGTCCAAGG CTGCTTTGGC TTTCATTGCT TTCGTTATTG TTTTGATTGT TAACGCT
-48 -38 -28 -18 -8 -1

CONSENSUS LEADER SEQUENCE (B-A) IN YEAST PREFERRED CODONS

ATGTCCAAGT TCGTTATTGT TTTGATTGTT CCTGCTTTGG CTTTCATTGC TAACGCT
-48 -38 -28 -18 -8 -1